Polypeptides interactive with Bcl-XL

ABSTRACT

Described herein are methods and reagents for identifying polypeptides that bind to a Bcl-X_(L) polypeptide, and methods for identifying compounds that modulate the interaction between a Bcl-X_(L)-binding polypeptide and a Bcl-X_(L) polypeptide.

[0001] This application claims the benefit of the filing date of United States provisional application U.S. Ser. No. 60/274,526, filed Mar. 8, 2001, hereby incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] In general, the present invention relates to polypeptides that bind to Bcl-X_(L), methods for identifying such polypeptides, and methods for identifying compounds that modulate the interaction between a Bcl-X_(L)-binding polypeptide and Bcl-X_(L).

[0003] With the impending completion of the human genome sequence, interest is shifting to the emergent field of proteomics. One critical aspect of proteomics is the creation of a comprehensive map of protein-protein interactions. Such interactions are responsible for most signal transduction, making them attractive targets for drug therapy.

[0004] The primary methodology currently in use for interaction mapping is the yeast two-hybrid assay. Recently, genome-wide efforts to map protein-protein interactions have been reported for S. cerevisiae and, to a more limited extent, for C. elegans (Ito et al., Proc. Natl. Acad. Sci. U.S.A. 97:1143-1147, 2000; Uetz et al., Nature 403:623-627, 2000; and Walhout et al., Science 287:116-122, 2000). In the two-hybrid assay, the interaction of two proteins brings together their respective fusion partners, the DNA binding and activation domains of a transcription factor such as GAL4. This interaction thereby increases the transcription of a reporter gene that provides for the identification of interacting pairs.

[0005] While the yeast two-hybrid system has emerged as the leading technology in the field of protein-protein interactions, it is not without significant limitations. Firstly, the yeast two-hybrid system is limited by the in vivo nature of the assay. Binding interactions must take place under the conditions in the nucleus of the yeast cell, and many extracellular proteins are unstable under these reducing conditions. In addition, proteins may prove toxic to the yeast through interactions with host cell proteins. Secondly, in order to generate a signal, the two protein partners must be fused in an orientation that allows productive binding. Thirdly, because the two-hybrid system is a screening technique, there are practical limitations on the number of colonies that can be assayed.

[0006] Display technologies provide a powerful alternative and bypass many of the limitations of the two-hybrid system (Zozulya et al., Nat. Biotechnol. 17:1193-1198, 1999). In display methods, the interaction between a library member and a target polypeptide occurs in vitro, allowing optimal binding conditions to be used for different targets. Additionally, large libraries are screened iteratively, thus allowing even very low copy number proteins to be identified. However, in its most widely practiced form, phage display, this approach has similarly been hampered by the limitations of living systems. Specifically, libraries must be cloned, which decreases representation of the library members, can lead to the loss of sequences unstable in E. coli, and requires that proteins be properly processed to allow assembly of phage particles. In addition, the generation of libraries large enough to cover the entire proteome is difficult.

SUMMARY OF THE INVENTION

[0007] The present invention features the application of mRNA display to the identification of protein-protein interactions involving the anti-apoptotic protein Bcl-X_(L). The anti-apoptotic activity of Bcl-X_(L) is antagonized through binding to pro-apoptotic members of the Bcl-2 family, and protein members of the Bcl-2 family have been proposed as targets for drug therapy (Kinscherf et al., Expert. Opin. Investig. Drugs 9:747-764, 2000; Mattson and Culmsee, Cell Tissue Res. 301:173-187, 2000; and Chaudhary et al., Environ. Health Perspect. 107 Suppl 1:49-57, 1999). Methods for identifying Bcl-X_(L)-binding polypeptides through mRNA display, as well as polypeptides identified as Bcl-X_(L)-binding polypeptides and the nucleic acid sequences encoding such polypeptides, are described herein.

[0008] Accordingly, in a first aspect, the invention features a substantially pure human Bcl-X_(L)-binding polypeptide consisting of the sequence of any of SEQ ID NOS: 4-50, 63-71, and 224-228, or containing the sequence of any of SEQ ID NOS: 51-62, 229, and 230, as well as isolated nucleic acid molecules encoding those polypeptides (that is, SEQ ID NOS: 4-71 and 224-230), and vectors and cells containing those isolated nucleic acid molecules. In one embodiment, the nucleic acid molecule consists of the sequence of any of SEQ ID NOS: 156-202, 215-223, and 231-235. In another embodiment, the nucleic acid molecule contains the sequence of any of SEQ ID NOS: 203-214, 236, and 237. In another embodiment, the cell contains the vector into which an isolated nucleic acid molecule encoding a polypeptide of any of SEQ ID NOS: 4-71 and 224-230 is incorporated.

[0009] In a second aspect, the invention features a method of identifying a Bcl-X_(L)-binding polypeptide. The method involves providing a population of source labeled nucleic acid-protein fusion molecules; contacting the population of nucleic acid-protein fusion molecules with a Bcl-X_(L) polypeptide under conditions that allow interaction between the protein portion of a nucleic acid-protein fusion molecule of the population and the Bcl-X_(L) polypeptide; and detecting an interaction between the protein portion and the Bcl-X_(L) polypeptide, thereby identifying a Bcl-X_(L)-binding polypeptide. In a preferred embodiment, the population of source labeled nucleic acid-protein fusion molecules is derived from more than one source. In another preferred embodiment, the nucleic acid-protein fusion molecules are detectably-labeled. In yet another preferred embodiment, the Bcl-X_(L) polypeptide is immobilized on a solid support, and the detection of an interaction between the protein portion of a nucleic acid-protein fusion molecule and a Bcl-X_(L) polypeptide is carried out by detecting the labeled nucleic acid-protein fusion molecule bound to the solid support. In this case, the support is preferably a bead or a chip.

[0010] In a third aspect, the invention features a method of identifying a compound that modulates binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide. The method entails contacting a Bcl-X_(L) polypeptide with (i) a Bcl-X_(L)-binding polypeptide consisting of the sequence of any of SEQ ID NOS: 4-50, 63-71, and 224-228, or containing the sequence of any one of SEQ ID NOS: 51-62, 229, and 230, and (ii) a candidate compound, under conditions that allow binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide. The level of binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide is then determined. An increase or decrease in the level of binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide, relative to the level of binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide in the absence of the candidate compound, indicates a compound that modulates the interaction between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide. The modulation may be an increase or a decrease in binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide.

[0011] In one embodiment of this aspect of the invention, the Bcl-X_(L)-binding polypeptide is part of a nucleic acid-protein fusion molecule. In a preferred embodiment, the Bcl-X_(L)-binding polypeptide is a free polypeptide that is not part of a fusion. In another preferred embodiment, the Bcl-X_(L) polypeptide is attached to a solid support. In yet another preferred embodiment, the Bcl-X_(L)-binding polypeptide is detectably-labeled, and the level of binding between the Bcl-X_(L) polypeptide and the Bcl-X_(L)-binding polypeptide is determined by measuring the amount of Bcl-X_(L)-binding protein that binds to the solid support. Preferably, the solid support is a chip or a bead.
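The comparison at the heart of this third aspect, binding in the presence of a candidate compound versus binding in its absence, can be illustrated with a minimal Python sketch. It is an illustration only and not part of the disclosed method: the function name `classify_modulation`, the `tolerance` parameter, and the example signal values are assumptions introduced here, and a real assay would apply its own statistical criteria.

```python
def classify_modulation(binding_with, binding_without, tolerance=0.1):
    """Classify a candidate compound by its effect on measured binding.

    binding_with    -- binding signal measured with the candidate compound
    binding_without -- binding signal measured without it (the control)
    tolerance       -- hypothetical relative change treated as "no change"
    """
    if binding_without <= 0:
        raise ValueError("control binding must be positive")
    ratio = binding_with / binding_without
    if ratio > 1 + tolerance:
        return "increase"
    if ratio < 1 - tolerance:
        return "decrease"
    return "no change"


# Hypothetical signal values (arbitrary units), for illustration only.
print(classify_modulation(50.0, 100.0))   # candidate reduces binding
print(classify_modulation(150.0, 100.0))  # candidate enhances binding
```

Either an "increase" or a "decrease" result would mark the candidate as a modulator in the sense used above.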

[0012] In a fourth aspect, the invention features a method of source-labeling a nucleic acid-protein fusion molecule. This method involves providing an RNA molecule; generating a first cDNA strand using the RNA molecule as a template; generating a second cDNA strand complementary to the first cDNA strand, the second cDNA strand further including a nucleic acid sequence that identifies the source of the RNA molecule; generating a source labeled RNA molecule from the double stranded cDNA molecule of the previous step; attaching a peptide acceptor to the source labeled RNA molecule generated in the previous step; and in vitro translating the RNA molecule to generate a source labeled nucleic acid-protein fusion molecule.

[0013] In a related aspect, the invention features a source-labeled nucleic acid-protein fusion molecule, where the nucleic acid portion of the fusion molecule contains a coding sequence for the protein and a label that identifies the source of the nucleic acid portion of the fusion molecule.

[0014] In another related aspect, the invention features a method of identifying the source of the nucleic acid portion of a nucleic acid-protein fusion molecule. The method includes providing a population of nucleic acid-protein fusion molecules, each molecule containing a source label that identifies the source of the nucleic acid portion of the fusion; and determining the identity of the source label, thereby identifying the source of the nucleic acid portion of a nucleic acid-protein fusion molecule. In preferred embodiments, the source label is cell type-specific, tissue-specific, or species-specific. In another preferred embodiment, the population of nucleic acid-protein fusion molecules contains subpopulations of nucleic acid-protein fusion molecules from a plurality of sources.

[0015] In any of the above aspects of the invention, the nucleic acid-protein fusion molecule is preferably an RNA-protein fusion molecule, for example, as described by Roberts and Szostak (Proc. Natl. Acad. Sci. U.S.A. 94:12297-302, 1997) and Szostak et al. (WO 98/31700; and U.S. Ser. No. 09/247,190), hereby incorporated by reference. Alternatively, the nucleic acid-protein fusion molecule is a DNA-protein fusion molecule, for example a cDNA-protein fusion molecule. Such molecules are described, for example, in U.S. Ser. No. 09/453,190 and WO 00/32823, hereby incorporated by reference.

[0016] By “nucleic acid-protein fusion molecule” is meant a nucleic acid molecule covalently bound to a protein. The nucleic acid molecule may be an RNA or DNA molecule, or may include RNA or DNA analogs at one or more positions in the sequence. The “protein” portion of the fusion is composed of two or more naturally occurring or modified amino acids joined by one or more peptide bonds. “Protein,” “peptide,” and “polypeptide” are used interchangeably herein.

[0017] By “substantially pure polypeptide” or “substantially pure and isolated polypeptide” is meant a polypeptide (or a fragment thereof) that has been separated from components that naturally accompany it. Typically, the polypeptide is substantially pure when it is at least 60%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the polypeptide is a Bcl-X_(L)-binding polypeptide that is at least 75%, more preferably, at least 90%, and most preferably, at least 99%, by weight, pure. A substantially pure Bcl-X_(L)-binding polypeptide may be obtained, for example, by extraction from a natural source (e.g., a cell), by expression of a recombinant nucleic acid encoding a Bcl-X_(L)-binding polypeptide, or by chemically synthesizing the polypeptide. Purity can be measured by any appropriate method, e.g., by column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.

[0018] A protein is substantially free of naturally associated components when it is separated from those contaminants that accompany it in its natural state. Thus, a protein that is chemically synthesized or produced in a cellular system different from the cell from which it naturally originates will be substantially free from its naturally associated components. Accordingly, substantially pure polypeptides not only include those derived from eukaryotic organisms but also those synthesized in E. coli or other prokaryotes.

[0019] By a “Bcl-X_(L)-binding polypeptide” is meant a polypeptide that interacts with a Bcl-X_(L) polypeptide or a fragment of a Bcl-X_(L) polypeptide. The interaction of a Bcl-X_(L)-binding polypeptide with a Bcl-X_(L) polypeptide can be detected using binding assays described herein, or any other assay known to one skilled in the art. In addition, a Bcl-X_(L)-binding polypeptide may be contained in the protein portion of a nucleic acid-protein fusion molecule.

[0020] By a “Bcl-X_(L) polypeptide” is meant a polypeptide that is substantially identical to the polypeptide sequence of GenBank Accession Number: Z23115, or a fragment thereof. For example, a Bcl-X_(L) polypeptide may consist of amino acids 1 to 211 of GenBank Accession Number: Z23115.

[0021] By “substantially identical” is meant a polypeptide or nucleic acid molecule exhibiting at least 50%, preferably, 60%, more preferably, 70%, still more preferably, 80%, and most preferably, 90% identity to a reference polypeptide or nucleic acid sequence. For comparison of nucleic acid molecules, the length of sequences for comparison will generally be at least 30 nucleotides, preferably, at least 50 nucleotides, more preferably, at least 60 nucleotides, and most preferably, the full length nucleic acid molecule. For comparison of polypeptides, the length of sequences for comparison will generally be at least 10 amino acids, preferably, at least 15 amino acids, more preferably, at least 20 amino acids, and most preferably, the full length polypeptide.

[0022] The “percent identity” of two nucleic acid or polypeptide sequences can be readily calculated by known methods, including but not limited to those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov and Devereux, eds., M. Stockton Press, New York, 1991; and Carrillo and Lipman, SIAM J. Applied Math. 48:1073, 1988.

[0023] Methods to determine identity are available in publicly available computer programs. Computer program methods to determine identity between two sequences include, but are not limited to, the GCG program package (Devereux et al., Nucleic Acids Research 12(1):387, 1984), BLASTP, BLASTN, and FASTA (Altschul et al., J. Mol. Biol. 215:403, 1990). The well known Smith Waterman algorithm may also be used to determine identity. The BLAST program is publicly available from NCBI and other sources (BLAST Manual, Altschul, et al., NCBI NLM NIH Bethesda, Md. 20894). Searches can be performed at URLs such as the following: http://www.ncbi.nlm.nih.gov/BLAST/unfinishedgenome.html; or http://www.tigr.org/cgi-bin/BlastSearch/blast.cgi. These software programs match similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
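As an illustration of the percent identity calculation and of the conservative substitution groups listed above, the following minimal Python sketch may be helpful. It is not the algorithm used by GCG, BLAST, or FASTA: it assumes the two sequences have already been aligned to equal length with `-` marking gaps, and it takes the number of aligned columns as the denominator, which is only one of several conventions in use. The function names are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes):
# Gly/Ala; Val/Ile/Leu; Asp/Glu/Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [set("GA"), set("VIL"), set("DENQ"),
                       set("ST"), set("KR"), set("FY")]


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of aligned columns,
    ignoring columns in which both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_conservative(residue_a, residue_b):
    """True if two different residues fall in the same listed group."""
    a, b = residue_a.upper(), residue_b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


print(percent_identity("ACDE", "ACDQ"))  # three of four columns match
print(is_conservative("E", "Q"))         # Glu/Gln share a group
```

Under the definition of "substantially identical" given earlier, a result of 50% or more over a sufficiently long comparison window would meet the minimum threshold.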

[0024] By a “compound,” “test compound,” or “candidate compound” is meant a chemical molecule, be it naturally-occurring or artificially-derived, and includes, for example, peptides, proteins, synthetic organic molecules, naturally-occurring organic molecules, nucleic acid molecules, and components thereof.

[0025] By a “solid support” is meant any solid surface including, without limitation, any chip (for example, silica-based, glass, or gold chip), glass slide, membrane, bead, solid particle (for example, agarose, Sepharose, polystyrene or magnetic bead), column (or column material), test tube, or microtiter dish.

[0026] By a “microarray” or “array” is meant a fixed pattern of immobilized objects on a solid surface or membrane. As used herein, the array is made up of polypeptides immobilized on the solid surface or membrane. “Microarray” and “array” are used interchangeably. Preferably, the microarray has a density of between 10 and 1,000 objects/cm².

[0027] By “detectably-labeled” is meant any means for marking and identifying the presence of a molecule, e.g., an oligonucleotide probe or primer, a gene or fragment thereof, a cDNA molecule, or an antibody. Methods for detectably-labeling a molecule are well known in the art and include, without limitation, radioactive labeling (e.g., with an isotope such as ³²P or ³⁵S) and nonradioactive labeling (e.g., with a fluorescent label, such as fluorescein, or a chemiluminescent label).

[0028] By a “source label” is meant a nucleic acid sequence that is attached to a nucleic acid-protein fusion molecule. The source label identifies the origin of the nucleic acid portion of a nucleic acid-protein fusion molecule. For example, a source label can identify a specific cell type, tissue type, or species from which the nucleic acid portion of a nucleic acid-protein fusion molecule is derived. The source label also permits the selection of nucleic acid-protein fusion molecules from a particular source from a pool of nucleic acid-protein fusion molecules from various sources. For example, a primer or a probe can be designed to detect the source label of nucleic acid-protein fusion molecules from a particular source, thereby allowing amplification or detection by hybridization of those particular fusion molecules. Such a primer or probe can also be designed for use as a handle for purification of a nucleic acid molecule or a nucleic acid-protein fusion molecule.
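The way a source label lets sequences from a mixed pool be assigned back to their origin can be sketched in a few lines of Python. The sketch is illustrative only: the label sequences, the tissue names, and the assumption that the label sits at the 5′ end of each recovered nucleic acid sequence are all hypothetical and are not taken from this disclosure.

```python
from collections import defaultdict

# Hypothetical source labels; real labels and their placement may differ.
SOURCE_LABELS = {
    "ACGTAC": "kidney",
    "TGCATG": "liver",
}


def sort_by_source(sequences, labels=SOURCE_LABELS):
    """Group sequences by the source label found at their 5' end.

    The label is stripped from each assigned sequence; sequences with
    no recognized label are collected under "unassigned".
    """
    by_source = defaultdict(list)
    for seq in sequences:
        for label, source in labels.items():
            if seq.startswith(label):
                by_source[source].append(seq[len(label):])
                break
        else:  # no label matched this sequence
            by_source["unassigned"].append(seq)
    return dict(by_source)


pool = ["ACGTACGGG", "TGCATGAAA", "CCCC"]
print(sort_by_source(pool))
```

A primer or probe complementary to one label would achieve the corresponding selection at the bench; the code mirrors only the bookkeeping.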

[0029] By “sequence cluster” is meant a group of sequences that form a continuous single sequence when their overlapping sequences are aligned. For example, a sequence cluster can be a set of sequences that each contain sequences in common with the other members of the sequence cluster. Sequence clusters can be formed using, for example, the computer program MacVector.
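To make the idea of a sequence cluster concrete, the following Python sketch merges fragments that overlap at their ends into one continuous sequence. It is a simple greedy, exact-match approach written for illustration and is not the method used by MacVector; the function names, the `min_len` overlap threshold, and the example fragments are assumptions.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` equal to a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def build_cluster(fragments, min_len=3):
    """Greedily merge overlapping fragments into one continuous sequence.

    Returns (merged_sequence, fragments_that_could_not_be_merged).
    """
    remaining = list(fragments)
    contig = remaining.pop(0)
    progress = True
    while remaining and progress:
        progress = False
        for frag in remaining:
            right = overlap(contig, frag, min_len)
            left = overlap(frag, contig, min_len)
            if frag in contig:
                merged = contig
            elif contig in frag:
                merged = frag
            elif right:
                merged = contig + frag[right:]
            elif left:
                merged = frag + contig[left:]
            else:
                continue
            contig = merged
            remaining.remove(frag)  # safe: we leave the loop immediately
            progress = True
            break
    return contig, remaining


# Hypothetical peptide fragments, for illustration only.
print(build_cluster(["MKTAYIAK", "YIAKQRQ", "KQRQISF"]))
```

Fragments left in the second element of the returned tuple share no sufficient overlap with the merged sequence and would belong to a different cluster.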

[0030] By “high stringency conditions” is meant conditions that allow hybridization comparable with the hybridization that occurs using a DNA probe of at least 500 nucleotides in length, in a buffer containing 0.5 M NaHPO₄, pH 7.2, 7% SDS, 1 mM EDTA, and 1% BSA (fraction V), at a temperature of 65° C., or a buffer containing 48% formamide, 4.8× SSC, 0.2 M Tris-Cl, pH 7.6, 1× Denhardt's solution, 10% dextran sulfate, and 0.1% SDS, at a temperature of 42° C. (these are typical conditions for high stringency Northern or Southern hybridizations). High stringency hybridization is also relied upon for the success of numerous techniques routinely performed by molecular biologists, such as high stringency PCR, DNA sequencing, single strand conformational polymorphism analysis, and in situ hybridization. In contrast to Northern and Southern hybridizations, these techniques are usually performed with relatively short probes (e.g., usually 16 nucleotides or longer for PCR or sequencing, and 40 nucleotides or longer for in situ hybridization). The high stringency conditions used in these techniques are well known to those skilled in the art of molecular biology, and may be found, for example, in Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., 1998, hereby incorporated by reference.

[0031] By “transgene” is meant any piece of DNA that is inserted by artifice into a cell, and becomes part of the genome of the organism that develops from that cell. Such a transgene may include a gene that is partly or entirely heterologous (i.e., foreign) to the transgenic organism, or may represent a gene homologous to an endogenous gene of the organism.

[0032] By “transgenic” is meant any cell that includes a DNA sequence that is inserted by artifice into a cell and becomes part of the genome of the organism that develops from that cell. As used herein, the transgenic organisms are generally transgenic mammals (e.g., mice, rats, and goats) and the DNA (transgene) is inserted by artifice into the nuclear genome.

[0033] By “knockout mutation” is meant an artificially-induced alteration in the nucleic acid sequence (created via recombinant DNA technology or deliberate exposure to a mutagen) that reduces the biological activity of the polypeptide normally encoded therefrom by at least 80% relative to the unmutated gene. The mutation may, without limitation, be an insertion, deletion, frameshift mutation, or a missense mutation. The knockout mutation can be in a cell ex vivo (e.g., a tissue culture cell or a primary cell) or in vivo.

[0034] A “knockout animal” is a mammal, preferably, a mouse, containing a knockout mutation as defined above.

[0035] By “transformation,” “transfection,” or “transduction” is meant any method for introducing foreign molecules into a cell, e.g., a bacterial, yeast, fungal, algal, plant, insect, or animal cell. Lipofection, DEAE-dextran-mediated transfection, microinjection, protoplast fusion, calcium phosphate precipitation, retroviral delivery, electroporation, and biolistic transformation are just a few of the methods known to those skilled in the art which may be used. In addition, a foreign molecule can be introduced into a cell using a cell penetrating peptide, for example, as described by Fawell et al. (Proc. Natl. Acad. Sci. U.S.A. 91:664-668, 1994) and Lindgren et al. (TIPS 21:99-103, 2000).

[0036] By “transformed cell,” “transfected cell,” or “transduced cell,” is meant a cell (or a descendant of a cell) into which a nucleic acid molecule encoding a polypeptide of the invention has been introduced, by means of recombinant nucleic acid techniques.

[0037] By “promoter” is meant a minimal sequence sufficient to direct transcription. If desired, constructs of the invention may also include those promoter elements that are sufficient to render promoter-dependent gene expression controllable in a cell type-specific, tissue-specific, or temporal-specific manner, or inducible by external signals or agents; such elements may be located in the 5′ or 3′ or intron sequence regions of the native gene.

[0038] By “operably linked” is meant that a gene and one or more regulatory sequences are connected in such a way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins) are bound to the regulatory sequences.

[0039] By “sample” is meant a tissue biopsy, cells, blood, serum, urine, stool, or other specimen obtained from a patient or test subject. The sample is analyzed to detect a mutation in a gene encoding a Bcl-X_(L)-binding polypeptide, or expression levels of a gene encoding a Bcl-X_(L)-binding polypeptide, by methods that are known in the art. For example, methods such as sequencing, single-strand conformational polymorphism (SSCP) analysis, or restriction fragment length polymorphism (RFLP) analysis of PCR products derived from a patient sample may be used to detect a mutation in a gene encoding a Bcl-X_(L)-binding polypeptide; ELISA may be used to measure levels of a Bcl-X_(L)-binding polypeptide; and PCR may be used to measure the level of nucleic acids encoding a Bcl-X_(L)-binding polypeptide.

[0040] By “apoptosis” is meant cell death characterized by any of the following properties: nuclear condensation, DNA fragmentation, membrane blebbing, or cell shrinkage.

[0041] By “modulating” is meant either increasing (“upward modulating”) or decreasing (“downward modulating”) the number of cells that undergo apoptosis in a given cell population. Preferably, the cell population is selected from a group including cancer cells (e.g., ovarian cancer cells, breast cancer cells, pancreatic cancer cells), leukemic cells, lymphoma cells, T cells, neuronal cells, fibroblasts, or any other cell line known to proliferate in a laboratory setting. It will be appreciated that the degree of apoptosis modulation provided by an apoptosis modulating compound in a given assay will vary, but that one skilled in the art can determine the statistically significant change in the level of apoptosis that identifies a compound that increases or decreases apoptosis. Preferably, for downward modulating, apoptosis is decreased by at least 20%, more preferably, by at least 40%, 50%, or 75%, and, most preferably, by at least 90%, relative to a control sample which was not administered an apoptosis downward modulating test compound. Also as used herein, preferably, for upward modulating, apoptosis is increased by at least 1.5-fold to 2-fold, more preferably, by at least 3-fold, and most preferably, by at least 5-fold, relative to a control sample which was not administered an apoptosis upward modulating test compound.

[0042] By an “apoptotic disease” is meant a condition in which the apoptotic response is abnormal. This may pertain to a cell or a population of cells that does not undergo cell death under appropriate conditions. For example, normally a cell will die upon exposure to apoptotic-triggering agents, such as chemotherapeutic agents, or ionizing radiation. When, however, a subject has an apoptotic disease, for example, cancer, the cell or a population of cells may not undergo cell death in response to contact with apoptotic-triggering agents. In addition, a subject may have an apoptotic disease when the occurrence of cell death is too low, for example, when the number of proliferating cells exceeds the number of cells undergoing cell death, as occurs in cancer when such cells do not properly differentiate.

[0043] An apoptotic disease may also be a condition characterized by the occurrence of inappropriately high levels of apoptosis. For example, certain neurodegenerative diseases, including but not limited to Alzheimer's disease, Huntington's disease, Parkinson's disease, amyotrophic lateral sclerosis, multiple sclerosis, restenosis, stroke, and ischemic brain injury are apoptotic diseases in which neuronal cells undergo undesired cell death.

[0044] By “proliferative disease” is meant a disease that is caused by or results in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both. For example, cancers such as lymphoma, leukemia, melanoma, ovarian cancer, breast cancer, pancreatic cancer, and lung cancer are all examples of proliferative disease.

[0045] By a “substantially pure nucleic acid,” “isolated nucleic acid,” or “substantially pure and isolated nucleic acid” is meant nucleic acid (for example, DNA) that is free of the genes which, in the naturally-occurring genome of the organism from which the nucleic acid of the invention is derived, flank the nucleic acid. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or that exists as a separate molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding additional polypeptide sequence.

[0046] By “antisense,” as used herein in reference to nucleic acids, is meant a nucleic acid sequence, regardless of length, that is complementary to the coding strand of a nucleic acid molecule encoding a Bcl-X_(L) polypeptide or a Bcl-X_(L)-binding polypeptide. Preferably, the antisense nucleic acid molecule is capable of modulating apoptosis when present in a cell. Modulation of at least 10%, relative to a control, is recognized; preferably, the modulation is at least 25%, 50%, or more preferably, 75%, and most preferably, 90% or more.

[0047] By a “purified antibody” is meant an antibody that is at least 60%, by weight, free from proteins and naturally occurring organic molecules with which it is naturally associated. Preferably, the preparation is at least 75%, more preferably, 90%, and, most preferably, at least 99%, by weight, antibody, e.g., a Bcl-X_(L)-binding polypeptide-specific antibody. A purified antibody may be obtained, for example, by affinity chromatography using recombinantly-produced protein or conserved motif peptides and standard techniques.

[0048] By “specifically binds” is meant a compound that recognizes and binds a protein or polypeptide, for example, a Bcl-X_(L) polypeptide or a Bcl-X_(L)-binding polypeptide, and that when detectably labeled can be competed away for binding to that protein or polypeptide by an excess of compound that is not detectably labeled. A compound that non-specifically binds is not competed away by the above excess of compound that is not detectably labeled.

[0049] The present invention has several utilities. Since the Bcl-2 family of proteins, and Bcl-X_(L) itself, has been implicated in apoptosis, these Bcl-2 family polypeptides can be used in screens for therapeutics that modulate diseases or developmental abnormalities involving overactivity or underactivity of apoptotic pathways. In particular, Bcl-X_(L) is known to protect cancer cells (e.g., pancreatic carcinoma cells) from stimulation of apoptosis, and this effect is reversible by adding an agent, Bax (Hinz et al., Oncogene 19:5477-5486, 2000), that binds to Bcl-X_(L) at the same site as many of the polypeptides of the present invention. Therefore, the polypeptides that bind to Bcl-X_(L), described herein, may be used as targets in therapeutics screening assays. The identified polypeptides are particularly useful in such screens because they represent the functional portions of human proteins that bind Bcl-X_(L). These polypeptides may also be used to detect Bcl-X_(L) polypeptides in a sample. In addition, the methods of the present invention are useful as high-throughput screening methods for potential therapeutics involved in the overactivity or underactivity of apoptotic pathways.

[0050] The general approach of the present invention also provides a number of advantages. For example, direct mRNA display allows the mapping of protein-protein interactions, which is useful for drug screening. In mRNA display (Roberts and Szostak, supra), a DNA template is used to transcribe an engineered mRNA molecule possessing suitable flanking sequences (e.g., a promoter; a functional 5′ UTR to allow ribosome binding; a start codon; an open reading frame; a sequence for polypeptide purification; and a conserved sequence used for ligation to a complementary linker containing puromycin). To the 3′ end of the mRNA, a linker strand with a puromycin moiety (Pu) is then added, preferably by photo-crosslinking. When this RNA is translated in vitro, the puromycin becomes incorporated at the C-terminus of the nascent peptide. The resulting mRNA display construct is then purified after ribosome dissociation. A cDNA strand is then synthesized to protect the RNA and to provide a template for future PCR amplification. A library of such constructs can be incubated with immobilized target, and molecules that bind are enriched by washing away unbound material. Bound cDNAs are recovered, for example, by KOH elution, and subsequent PCR is performed to regenerate a library enriched for target-binding peptides. FIG. 1 shows the steps involved in mRNA display.
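The enrichment achieved by repeated rounds of binding, washing, and amplification can be illustrated with a simple numerical model in Python. This is a sketch under stated assumptions rather than a description of measured behavior: it assumes that a fixed fraction of binders and a smaller fixed fraction of non-binders survive each wash, and that PCR amplifies all recovered sequences equally. The function name and every number below are hypothetical.

```python
def enrich(initial_fraction, binder_retention, nonbinder_retention, rounds):
    """Model the binder fraction of a library across selection rounds.

    initial_fraction    -- fraction of library members that bind the target
    binder_retention    -- assumed fraction of binders kept per round
    nonbinder_retention -- assumed fraction of non-binders kept per round
    Returns a list with the binder fraction before round 1 and after
    each subsequent round (length rounds + 1).
    """
    fractions = [initial_fraction]
    f = initial_fraction
    for _ in range(rounds):
        kept_binders = f * binder_retention
        kept_nonbinders = (1.0 - f) * nonbinder_retention
        f = kept_binders / (kept_binders + kept_nonbinders)
        fractions.append(f)
    return fractions


# Hypothetical values: 1 binder per million, 50% vs 0.5% retention.
for round_number, fraction in enumerate(enrich(1e-6, 0.5, 0.005, 4)):
    print(round_number, f"{fraction:.6f}")
```

In this model each round multiplies the odds of drawing a binder by the ratio of the two retention values, which is why a rare binder can come to dominate the library after only a few rounds.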

[0051] As mRNA display is a completely in vitro technique, many of the problems inherent in cloning and expression are eliminated. The elimination of cloning bottlenecks in library preparation allows the generation of very large libraries, routinely in the range of 10¹³ members. In addition, the formation of mRNA display constructs is readily achieved in a mammalian expression system, thereby providing suitable chaperones for the folding of human proteins and the potential for appropriate post-translational modifications.

[0052] Other features and advantages of the invention will be apparentfrom the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0053] FIG. 1 is a schematic representation of iterative selection using mRNA display.

[0054] FIG. 2A shows the sequences of positive control polypeptides used in Bcl-X_(L) polypeptide binding assays (SEQ ID NOS: 238-240).

[0055] FIG. 2B is a graph showing the binding of control polypeptides to a Bcl-X_(L) polypeptide.

[0056] FIG. 3A shows polypeptides identified as Bcl-X_(L) binding polypeptides using methods described herein (SEQ ID NOS: 1-71), as well as information on the binding affinity and specificity of the polypeptides. In addition, the number of clones of each sequence cluster obtained from each library is presented.

[0057] FIG. 3B shows the polypeptide sequences of Bcl-X_(L)-binding polypeptides (SEQ ID NOS: 1-71), and indicates corresponding nucleic acid sequences.

[0058] FIG. 3C shows the nucleotide sequences of selected Bcl-X_(L)-binding polypeptides (SEQ ID NOS: 72-142). The nucleotide sequences encoding the selected Bcl-X_(L)-binding polypeptides are underlined (SEQ ID NOS: 153-223).

[0059] FIG. 4 is a schematic representation of the alignment of selected Bcl-X_(L)-binding polypeptides within their parental proteins. Each unique fragment was analyzed to determine the location of the amino- and carboxyl-termini within the parental protein sequence, and these amino acids are indicated by residue and number. The number of isolated clones corresponding to each unique fragment was determined and is indicated next to the fragment ID. These fragments are mapped against the parental sequences of Bim, Bax, HSPC300, and TPR (SEQ ID NOS: 241-244). The BH3 domain core sequence is underlined for the BimL and Bax proteins. Splice variants are indicated by a * in the ID and the use of (−) in place of (=) in the fragment map.

[0060] FIG. 5 is a graph of the relative binding affinity of a selected Bak Bcl-X_(L)-binding polypeptide to immobilized Bcl-X_(L) polypeptide versus concentration of immobilized Bcl-X_(L) polypeptide.

[0061] FIG. 6 is a graph of the effect of the binding of a Bcl-X_(L)-binding polypeptide in the presence of a competitor BH3 domain from the Bcl-2 family member Bak.

[0062] FIG. 7 shows the polypeptide sequences of representative clones (SEQ ID NOS: 1, 5, 245, 60, 61, 6, 46, 2, 33, 4, 7-10, 3, 11, 48, 12, 53, and 54) from sequence clusters that were bound to a Bcl-X_(L) polypeptide in the presence of a competitor BH3 domain from the Bcl-2 family member Bak. Competitive binding was determined relative to a control containing no competitor. The selected polypeptide sequences are shown aligned by sequence homology, where possible, to the known BH3 domains of Bim, Bak, and Bax.

[0063] FIGS. 8A, 8B, and 8C are tables of amino acid sequences that bind Bcl-X_(L) protein (FIGS. 8A and 8B; SEQ ID NOS: 224-230) and their nucleic acid coding sequences (FIG. 8C; SEQ ID NOS: 231-237).

[0064] FIG. 9 is a graph showing free peptide binding to GST-Bcl-X_(L), as compared to background binding to GST. Bax was used as a positive control for Bcl-X_(L) binding.

[0065] Described herein are methods for identifying polypeptides that interact with a Bcl-X_(L) polypeptide; methods for identifying compounds that increase or decrease the binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide; methods for source labeling a nucleic acid-protein fusion molecule; and methods for identifying the source of the nucleic acid portion of a nucleic acid-protein fusion molecule. Techniques for carrying out each method of the invention are now described in detail, using particular examples. The examples are provided for the purpose of illustrating the invention, and should not be construed as limiting. Also described herein are novel Bcl-X_(L)-binding polypeptides and nucleic acid molecules obtained through the methods of the present invention.

[0066] Materials and Methods for Identifying Bcl-X_(L)-Binding Polypeptides

[0067] The experiments described herein were carried out using the materials and methods described below.

[0068] Choice of UTR Sequence Tags

[0069] Unique UTR sequences that are compatible with translation in rabbit reticulocyte lysate were identified by selection from a library of c-myc mRNAs with a partially randomized 5′ UTR. The c-myc construct described by Roberts and Szostak (supra) was amplified by PCR using the 5′ primer TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT HHH HHH HHA CAA TGG CTG AAG AAC AGA AAC TG (where H is an equimolar mixture of A, C, and T) (SEQ ID NO: 143). This amplification inserted 8 random bases into the 5′ UTR upstream of the ATG start codon, to give a library of 3⁸ (6561) different mRNA molecules after in vitro transcription with T7 RNA polymerase. Fusion formation, reverse transcription, and immunoprecipitation with an anti-c-myc antibody were carried out as described by Roberts and Szostak (supra) to separate mRNAs that had undergone translation from those that had not. The successfully translated and fused sequences were amplified by PCR using the 5′ primer TAATACGACTCACTATAGGGACAATTACTATTTACAATT (SEQ ID NO: 144), in which the T7 promoter is underlined, to preserve the information in the randomized region. Sequences obtained from individual clones were subsequently used in the construction of tagged libraries.
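The library size quoted above follows directly from the degeneracy of the randomized positions. As a minimal sketch (the function and table names are illustrative and not part of the described method), the count can be reproduced from the standard IUPAC nucleotide codes:

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def library_diversity(template: str) -> int:
    """Return how many distinct sequences a degenerate template encodes."""
    diversity = 1
    for base in template.upper():
        diversity *= IUPAC_DEGENERACY[base]
    return diversity


# The randomized region of the 5' primer holds eight H positions
# (H = A, C, or T), so the diversity is 3 to the power 8.
print(library_diversity("HHHHHHHH"))
```

The same function applies to the N₉ random priming regions used later in library preparation, where each position admits four bases.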

[0070] Library Preparation

[0071] The design of the above-described sequence tags can be adapted to source label nucleic acid sequences from various sources. Instead of each sequence tag being a unique sequence (as described above), one sequence tag (source label) is used to label a set of nucleic acid sequences derived from the same cell, tissue, or species. The source labeled sequences can then be pooled with different source labeled sequences and used for mRNA display as described herein, and the origin of each sequence in the pool can be determined.

[0072] Individual RNA sequences are translated in vitro, and RNA-protein fusions are formed, for example, according to the methods of Roberts and Szostak (supra) and Szostak et al. (WO 98/31700; U.S. Ser. No. 09/247,190), hereby incorporated by reference. Specifically, each mRNA display library was prepared according to the following methods. Poly-A+ mRNA (Clontech) was primed using the oligonucleotides GGAACTTGCTTCGTCTTTGCAATCN₉ (SEQ ID NO: 145) or GGATGATGCTTCGTCTTTGTAATCN₉ (SEQ ID NO: 146) and a cDNA molecule was synthesized using SuperScript II Reverse Transcriptase (Promega). Two primers were used initially, to allow the investigation of different ligation sequences; these sequences were subsequently altered and made uniform by the use of a single PCR primer under conditions that would allow it to anneal to either template. The RNA/cDNA hybrid molecule was then treated with RNase H in order to partially degrade the RNA member of the hybrid molecule. Unextended primers were then removed by purification over an S-300 (Pharmacia) size exclusion column.

[0073] Second strand cDNA synthesis was carried out by the Klenow fragment of E. coli DNA polymerase, using primers having the sequence GGACAATTACTATTTACAATT[H₈]ACAATGN₉ (SEQ ID NO: 147) that included a 5′ UTR with a sequence tag H₈ (source label), derived as described above, and a start codon (underlined). In the production of libraries from human kidney, liver, bone marrow, and brain mRNAs, the source labels CTCCTAAC (SEQ ID NO: 250), CTTTCTCT (SEQ ID NO: 251), CTTACTTC (SEQ ID NO: 252), and ATTTCAAT (SEQ ID NO: 253) were used, respectively. Unextended primers were again removed by S-300 size exclusion chromatography, and the cDNA product was then PCR amplified using a forward primer encoding the T7 promoter (underlined) and 5′ UTR, TAATACGACTCACTATAGGGACAATTACTATTTACAATT (SEQ ID NO: 148), and reverse primers corresponding to the fixed regions of the first strand primers above. After PCR product purification using spin columns (Qiagen), short fragments were removed by S-300 size exclusion chromatography.
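Because each source label sits at a fixed position between the constant 5′ UTR sequence and the ACAATG start region, the tissue of origin of a sequenced clone can be read off by pattern matching. The following is an illustrative sketch only; the function name, the handling of unrecognized tags, and the example read are assumptions made here, while the four labels and the flanking sequences are those given above.

```python
import re

# Source labels and tissues as listed for SEQ ID NOS: 250-253.
SOURCE_LABELS = {
    "CTCCTAAC": "kidney",
    "CTTTCTCT": "liver",
    "CTTACTTC": "bone marrow",
    "ATTTCAAT": "brain",
}

# Constant 5' UTR, then the 8-nt tag (A, C, or T at each position),
# then ACA followed by the ATG start codon.
TAG_PATTERN = re.compile(r"GGACAATTACTATTTACAATT([ACT]{8})ACAATG")


def tissue_of_origin(read: str):
    """Return the tissue for a read, 'unknown' for an unlisted tag,
    or None if the constant UTR context is not found."""
    match = TAG_PATTERN.search(read.upper())
    if match is None:
        return None
    return SOURCE_LABELS.get(match.group(1), "unknown")


# Hypothetical read: T7 promoter and UTR, the liver label, then coding sequence.
example = "TAATACGACTCACTATAGGGACAATTACTATTTACAATTCTTTCTCTACAATGGCTGAAGAA"
print(tissue_of_origin(example))
```

Matching the flanking sequences as well as the tag guards against an 8-nt label sequence occurring by chance inside the coding region.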

[0074] mRNA Display Construct Formation

[0075] The above described PCR products were reamplified using the forward primer described above (TAATACGACTCACTATAGGGACAATTACTATTTACAATT) (SEQ ID NO: 148) and a single reverse primer, TTTTAAATAGCGCATGCCTTATCGTCATCGTCTTTGTAATC (SEQ ID NO: 149), encoding the FLAG-M2 epitope (underlined) and a region complementary to the photoligation linker (italics). The single reverse primer was used to amplify libraries containing each of the initial first strand primer sequences in order to produce a single uniform end. These amplicons were then used as templates for transcription using T7 RNA polymerase (Ambion MegaScript). The resulting RNA molecules were purified by phenol/chloroform/isoamyl alcohol extraction and NAP column (Pharmacia) purification. The puromycin-containing linker 5′-Pso-TAGCGGATGCA₁₈XXCCPu (where X is PEG spacer 9; Pso is psoralen; and Pu is puromycin) was photo-ligated to the 3′ end of the RNA essentially as described by Kurz et al. (Nucleic Acids Res. 28:E83, 2000). Ligated RNA molecules were then translated for 30 min at 30° C. in a 300 μl reaction containing 200 μl of rabbit reticulocyte lysate (Ambion), 120 pmole of ligated RNA, 10 μl of an amino acid mix lacking methionine (Ambion), and 15 μl of ³⁵S-met (Amersham). Subsequently, 100 μl of 2 M KCl and 25 μl of 1 M MgCl₂ were added to facilitate formation of the mRNA display complex. The mRNA display constructs were then purified by binding to 100 μl of 50% oligo-dT cellulose slurry in a total volume of 10 ml (100 mM Tris-HCl (pH 8), 10 mM EDTA, 1 M NaCl, and 0.25% Triton X-100) at 4° C. for 1 hr. The binding reaction was then transferred to a column (BIORAD), washed 3 times with 1 ml of binding buffer containing no EDTA, then eluted with 100 μl aliquots of 2 mM Tris-HCl (pH 8), 0.05% Triton X-100, and 5 mg/ml BSA.

[0076] A cDNA strand was synthesized using SuperScript II RT (Promega) and the reverse sense PCR primer in the manufacturer supplied buffer. The reverse transcription reaction was then diluted to 1 ml in TBK buffer (50 mM Tris-HCl (pH 7.5), 150 mM KCl, and 0.05% Triton X-100) and incubated with 200 μl of anti-FLAG Ab immobilized on agarose beads (Sigma) for 1 hr at 4° C. The binding reaction was transferred to a column and the beads were washed 3 times with 1 ml of TBK buffer. mRNA-display constructs were then eluted with 100 μl aliquots of TBK buffer containing 100 μM FLAG-M2 peptide, 5 mg/ml BSA, and 0.1 mg/ml salmon sperm DNA. The yield of mRNA-protein fusion product was determined by scintillation counting the purified product and comparing it to an estimated specific activity of methionine based on an approximate concentration of 5 μM in the lysate. For the libraries containing a heterogeneous population of proteins, the prevalence of methionine was approximated as one initiator methionine per molecule plus one for each 60 amino acids.
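The yield estimate described above can be sketched as a short calculation. This is a hedged illustration, not the procedure actually used: the specific activity (counts per pmol of methionine) is treated as a supplied input, the average protein length is an assumed parameter, and the "one per 60 amino acids" rule is applied as a fractional average across a heterogeneous library. All numbers in the example call are hypothetical.

```python
def methionines_per_molecule(average_length_aa: float) -> float:
    """One initiator methionine plus one for each 60 amino acids."""
    return 1.0 + average_length_aa / 60.0


def fusion_yield_pmol(counts_cpm: float,
                      cpm_per_pmol_met: float,
                      average_length_aa: float) -> float:
    """Convert scintillation counts into pmol of fusion product.

    cpm_per_pmol_met is the estimated specific activity of methionine
    in the translation reaction (an input assumed to be known here).
    """
    met_count = methionines_per_molecule(average_length_aa)
    return counts_cpm / (cpm_per_pmol_met * met_count)


# Hypothetical example: 30,000 cpm recovered, 1,000 cpm per pmol of
# methionine, and an assumed average length of 120 amino acids.
print(fusion_yield_pmol(30000, 1000, 120))
```

Dividing by the methionine count matters because each fusion molecule carries several labeled residues, so raw counts would otherwise overstate the number of molecules.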

[0077] Target Protein Preparation

[0078] A portion of the human Bcl-X_(L) gene was PCR amplified from a GeneStorm® Expression-Ready Bcl-X_(L) clone (Invitrogen, Carlsbad, Calif.) using the primers AGTATCGAATTCATGTCTCAGAGCAACCGG (SEQ ID NO: 150) and TACAGTCTCGAGCTAGTTGAAGCGTTCCTGGCCCT (SEQ ID NO: 151). The 644 nucleotide Bcl-X_(L) DNA fragment obtained from the above PCR reaction was then cloned into the expression vector 4T-1 (Pharmacia). Competent E. coli (BL21 (DE3) pLysS) were transformed with the expression vector and grown on LB/carbenicillin plates overnight at 37° C. A single transformed colony was then selected and grown overnight in 5 ml of LB/carbenicillin. Two ml of this starter culture was used to inoculate a fresh 100 ml culture, which was grown at 37° C. until an OD₆₀₀ of 0.6 was reached.

[0079] Expression of the Bcl-X_(L) polypeptide was induced in the bacterial culture by the addition of IPTG to a final concentration of 0.4 mM, and the culture was shaken at 25° C. overnight. The bacterial cells were then harvested for their Bcl-X_(L) polypeptide by centrifugation at 12,000 g for 30 minutes. The cell pellet was resuspended in 1/10 volume 100 mM Tris/HCl (pH 8.0)/100 mM NaCl/0.1% Triton X-100/1.0% glycerol, and the cells were lysed by Dounce homogenization and three freeze/thaw cycles. The bacterial cell lysate was clarified by centrifugation at 16,000 g for 30 minutes, and 5 ml of the clarified lysate was applied to a 2 ml RediPack glutathione column (Pharmacia). The column was washed with 20 ml of lysis buffer and eluted, in a stepwise manner, with lysis buffer to which reduced glutathione had been added, to final concentrations of 1, 5, 10, 15, and 20 mM. Fractions of the eluate were analyzed on 4-12% NuPAGE gels (Novex) and positive fractions, based on polypeptide size, were pooled. The protein was dialyzed against 100 mM Tris/HCl (pH 8.0)/100 mM NaCl/0.05% Triton X-100/1.0% glycerol and the protein concentration was determined by BCA assay (Pierce).

[0080] Assay to Detect Binding of a Polypeptide to a Bcl-X_(L)-GST Fusion Protein

[0081] Detection of a polypeptide binding to a Bcl-X_(L)-glutathione S-transferase (GST) fusion protein was carried out as follows. Twenty microliters of glutathione Sepharose 4B slurry (AP Biotech) was aliquoted to a microcentrifuge tube and washed with PBS. The Bcl-X_(L)-GST fusion protein (60 μg), prepared as described above, was added and allowed to bind to the Sepharose beads for 1 hr at 4° C. The beads were then re-washed in selection buffer (50 mM Tris-HCl pH 7.5, 150 mM KCl, 0.05% Triton X-100, 0.5 mg/ml BSA, and 0.1 mg/ml salmon sperm DNA). The Bcl-X_(L)-GST beads were resuspended in 100 μl of selection buffer (approximately 11.5 μM Bcl-X_(L)) and ³⁵S-labeled mRNA display construct or free peptide was added (approximately 10-60 nM) and incubated on a rotator for 1 hr at 4° C. The reaction was then transferred to a microcentrifuge column (BioRad) and unbound mRNA display constructs or free peptides were removed by a 10 sec spin at 1,000 rpm. The Sepharose beads were then washed three times with 500 μl of selection buffer. The extent of binding between the Bcl-X_(L)-GST fusion protein and the mRNA display constructs or free peptides was determined by scintillation counting each fraction, including the recovered beads.
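The stated target concentration can be checked with a simple mass-to-molarity conversion. In the sketch below the molecular weight of the fusion protein is an assumption (about 52,000 g/mol, chosen here because it reproduces the stated value; it is not given in the text), and the calculation ignores bead volume and any protein that fails to bind.

```python
def micromolar(mass_ug: float, volume_ul: float, mw_g_per_mol: float) -> float:
    """Concentration in micromolar from mass (ug), volume (ul), and MW."""
    moles = mass_ug * 1e-6 / mw_g_per_mol   # grams divided by g/mol
    liters = volume_ul * 1e-6
    return moles / liters * 1e6             # mol/L converted to umol/L


# 60 ug of fusion protein in 100 ul, with an ASSUMED MW of 52 kDa.
ASSUMED_FUSION_MW = 52_000.0
print(round(micromolar(60.0, 100.0, ASSUMED_FUSION_MW), 2))
```

With these inputs the result is close to the approximately 11.5 μM quoted for the resuspended beads.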

[0082] Selection

[0083] A human Bcl-X_(L)-GST fusion protein was immobilized on Sepharose beads as described above for the binding assay and incubated with the mRNA display library. For the first round of selection, the input was approximately 0.06 pmol of each of the four source labeled libraries from human tissues (kidney, liver, bone marrow, and brain), which were mixed prior to selection. For subsequent rounds of selection, the input of each of the four source labeled libraries ranged from 0.25 to 0.92 pmol in total. After washing the beads of any unbound nucleic acid-protein fusion library members, the cDNA strands of the bound fusions were recovered in three elutions with 100 μl of 0.1N KOH. Eluates were subsequently neutralized by the addition of 2 μl of 1N Tris-HCl pH 7 and 8 μl of 1N HCl. A small scale PCR optimization was performed with the eluate to determine the number of cycles required to produce a strong signal without overamplification (typically 18-28 cycles). The library was then regenerated by PCR using the remainder of the eluate.

[0084] Cloning and Sequencing of Library Members that Bind to the Bcl-X_(L)-GST Fusion Protein

[0085] PCR products of the selected library members that bound to the Bcl-X_(L)-GST fusion protein were cloned into the TOPO-TA vector (Invitrogen) and, after isolation of individual colonies, the plasmids were purified (Qiagen) and sequenced using standard sequencing techniques (Ausubel et al., supra).

[0086] In Vitro Synthesis of Polypeptides that Bind to the Bcl-X_(L)-GST Fusion Protein

[0087] To synthesize polypeptides that interacted with the Bcl-X_(L)-GST fusion protein, RNAs were prepared from the PCR products of the selected library members that bound to the Bcl-X_(L)-GST fusion protein and purified as described above. After translation in rabbit reticulocyte lysate (Ambion), the peptides were purified directly from the lysate by immunoprecipitation and peptide elution based on a C-terminal FLAG-M2 epitope contained in the peptide (Sigma).

[0088] Detection of Known Bcl-X_(L)-Binding Polypeptides

[0089] Members of the Bcl-2 family of apoptotic proteins function via homo- and heterodimerization, occurring primarily through the binding of a single α-helix designated the BH3 domain (Bcl-2 Homology domain 3) in a corresponding pocket produced by three α-helices in the interacting partner (Diaz et al., J. Biol. Chem. 272:11350-11355, 1997; and Sattler et al., Science 275:983-986, 1997). The target protein used herein was the human Bcl-X_(L) protein produced as a GST fusion and immobilized on glutathione Sepharose beads. The BH3 domains of three different Bcl-2 family proteins (Bcl-2, Bax, and Bak) were prepared as mRNA display constructs, as described herein, along with control peptides derived from unrelated proteins Stat-1 and Raf-1. The BH3 domains of Bcl-2, Bax, and Bak are shown with the consensus regions aligned and highlighted in FIG. 2A. Individual mRNA display constructs were incubated with either the target Bcl-X_(L)-GST fusion protein bound to glutathione beads or with the beads alone. Unbound materials were collected, and the beads were washed. The amount of peptide bound to the beads was determined by scintillation counting and graphed as the percent of input counts bound (FIG. 2B).
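The "percent of input counts bound" readout can be expressed as a small calculation over the counted fractions. This sketch assumes that the input is taken as the sum of the unbound flow-through, the washes, and the recovered beads; the function name and the example counts are hypothetical.

```python
def percent_bound(flow_through_cpm: float, wash_cpm: list, bead_cpm: float) -> float:
    """Percent of total recovered counts that remained on the beads."""
    total = flow_through_cpm + sum(wash_cpm) + bead_cpm
    if total <= 0:
        raise ValueError("no counts recovered")
    return 100.0 * bead_cpm / total


# Hypothetical counts for one construct: flow-through, three washes, beads.
print(percent_bound(5000, [600, 300, 100], 4000))
```

Running the same calculation for beads without target gives the background value against which target-specific binding is judged.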

[0090] Binding of Bcl-2, Bax, and Bak to Bcl-X_(L)-GST fusion protein was specific to the BH3 helices, with Bak binding most efficiently (40%) followed by Bax (6%); no binding was observed for the BH3 helix from Bcl-2 or either control. The ordering of Bax and Bak is in good agreement with published IC₅₀ values, which indicate that Bcl-X_(L) has an affinity for the Bak BH3 domain that is approximately five-fold higher than that for Bax (Diaz et al., supra). The lack of binding observed for the BH3 domain of Bcl-2 could be due to the BH3 domain peptide failing to form a helix (Zhang et al., Biochem. Biophys. Res. Commun. 208:950-956, 1995; and Xie et al., Biochemistry 37:6410-6418, 1998), or the affinity may be below that required to generate a signal in this assay.

[0091] Identification of Novel Bcl-X_(L)-Binding Polypeptides

[0092] Having established the binding of Bcl-X_(L) control peptides, as described above, a selection to identify binders from within the complex mixture of an mRNA display library was initiated (FIG. 1). Four libraries, individually prepared from the tissue-specific mRNAs of kidney, liver, bone marrow, and brain, were pooled prior to initiating selection. Each library contained a unique 8 nucleotide (nt) tag (source tag) within the 5′ UTR to allow specific amplification of an individual library. The ability to mix libraries not only increased the size and diversity of the starting pool but also, through identification of the tissue of origin for each selected protein, provided information similar to that normally obtained from mRNA expression analysis.

[0093] As a target for the selection, a GST fusion protein of Bcl-X_(L) was immobilized on glutathione Sepharose beads. The selection was initiated with a combined library of approximately 1.5×10¹¹ molecules. After incubation of the library with the target, unbound members of the library were washed away and the bound material was eluted. An enriched library was then regenerated by PCR, transcription, ligation, translation, fusion, reverse-transcription, and purification. This enriched library was then used for the subsequent round of selection.
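The size of the combined starting library is consistent with the first-round input of approximately 0.06 pmol from each of the four source labeled libraries. A minimal sketch of the conversion (the function name is illustrative):

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_pmol(pmol: float) -> float:
    """Convert an amount in picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO


# Four libraries at approximately 0.06 pmol each.
combined_pmol = 4 * 0.06
print(f"{molecules_from_pmol(combined_pmol):.2e}")
```

The result is on the order of 10¹¹ molecules, in line with the approximate figure given for the combined library.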

[0094] After four rounds of selection, the enriched pool from the combined libraries bound the Bcl-X_(L) target at about 40%, an extent similar to the Bak control construct (see FIG. 2B). In order to determine if the selected Bcl-X_(L) polypeptides originated from one or multiple libraries, each library was prepared individually after specific amplification using library specific primers. The library constructed from brain mRNA was omitted due to cross-reaction of the PCR amplification primer. A test of binding revealed that each tissue-specific library bound to the target to an extent similar to the mixed pool. The bound material from each of the individual libraries was then recovered by elution, PCR amplified, and analyzed by cloning and sequencing.

[0095] Additional rounds of selection may change the population distribution significantly. A rare sequence from the starting pool that binds tightly might be enriched only to the point of appearing once among the clones while a poorer binding sequence that was abundant in the starting pool might still be found at high copy number. Also, sequencing more clones may lead to the identification of other proteins still present at low copy number.

[0096] Sequence Analysis of Bcl-X_(L) Binding Polypeptides

[0097] A total of 378 sequences were obtained from the above-described binding assay. Of the sequences, 181 were from the kidney library, 85 were from the liver library, and 112 were from the bone marrow library. Initial analysis of the sequences revealed a total of 71 distinct sequence clusters. Six of the clusters (8%) originated from all three libraries, 14 clusters (20%) originated from two of the three libraries, and the remaining 51 clusters (72%) originated from only one library. Many of the clusters contained a number of identical clones as well as a variety of clones with distinct 5′ or 3′ ends. This variety reflects the random priming used to prepare the library and allowed minimal functional regions of the Bcl-X_(L)-binding polypeptides to be delineated based on the overlapping regions of individual family members (FIG. 4). The sequences were then subjected to both nBLAST and pBLAST searches to identify the proteins represented by each cluster. Thirty-six of the clusters were from known polypeptides (SEQ ID NOS: 1-28, 63-69, and 71), twenty-three of the clusters were from hypothetical or unknown polypeptides whose nucleic acid sequences were found in the database (SEQ ID NOS: 29-50, and 70), and twelve clusters were unique polypeptide sequences (SEQ ID NOS: 51-62). These Bcl-X_(L)-binding polypeptide sequences are shown in FIG. 3B, and their corresponding nucleic acid sequences are shown underlined in FIG. 3C.

[0098] Twenty of the most frequently found Bcl-X_(L)-binding polypeptides are provided in Table 1. The number of clones in each cluster was further broken down by the number containing the source label of each individual library (NF indicates none found among the clones sequenced). The identification number of the specific clone from each cluster chosen for further characterization is also indicated. The numbers present in Table 1 reflect the diversity of polypeptides that interact with other polypeptides attainable from large libraries generated by the in vitro methods of the invention.

TABLE 1
Frequently found Bcl-X_(L) binding polypeptides

Protein                                       Kidney (181)  Liver (85)  Marrow (112)  Total (378)  Clone ID
Bim                                           43            11          36            90           T44
HSPC300                                       9             15          11            35           C68
TPR, nuclear pore complex-associated protein  23            NF          NF            23           C55
Bax                                           19            NF          3             22           C49
Novel Protein A                               1             11          6             18           V18
cDNA FLJ23277, Clone HEP03322                 12            2           2             16           X42
Hypothetical protein DKFZp586HO623            NF            1           15            16           V47
Syntaxin 4A                                   8             4           NF            12           U58
Tumor protein HDCMB21P                        1             5           5             11           V50
Proline/Glutamine rich splicing factor        7             1           1             9            —
Novel Protein B                               3             5           NF            8            V68
Talin                                         4             NF          1             5            X56
Thyroid hormone receptor-associated protein   5             NF          NF            5            U25
Sterol regulatory element binding txn factor  NF            NF          5             5            W17
Bcl-2 related proline-rich protein BPR        NF            2           3             5            Y75
cDNA FLJ22171, clone HRC00654                 NF            NF          5             5            T42
Toll-like receptor 3                          4             NF          NF            4            U15
Calpain                                       1             3           NF            4            V53
Bak                                           2             1           NF            3            C32
Novel protein D                               NF            1           2             3            T25

[0099] The most abundant Bcl-X_(L)-binding polypeptide (~25% of the total) was that of Bim, which was originally identified as a partner of Bcl-2 in a protein interaction screen and subsequently shown to bind to Bcl-X_(L) (O'Connor et al., EMBO J. 17:384-395, 1998). Two other proteins out of the top twenty, Bak and Bax, contain BH3 domains known to interact with Bcl-X_(L) (Diaz et al., supra). A fourth member of the Bcl-2 family, BPR, was also found in this screen. This newly reported member of the Bcl-2 family was not present in the database during the initial search. That a protein that was initially categorized as unknown is indeed a member of the Bcl-2 family reinforces the hypothesis that other novel polypeptides identified in the screen may also be members of the Bcl-2 family. While initial reports indicate that BPR contains a BH2 domain (Scorilas et al., unpublished, 2000), the present invention indicates that it also contains a BH3 domain.

[0100] Further analysis of the known Bcl-X_(L)-binding polypeptides was done to determine whether each selected Bcl-X_(L)-binding polypeptide sequence was from the coding region or UTR and if the reading frame matched that of the native protein. This analysis was used as a filter to eliminate false positives; polypeptides that failed at this step were not further characterized. Twenty-seven out of the thirty-six clusters from known polypeptides were in frame and within their native ORFs. Three out of thirty-six, proline/glutamine rich splicing factor (SEQ ID NO: 63), UDP glucuronosyl transferase 2B4 precursor (SEQ ID NO: 71), and cDNA FLJ20617 (SEQ ID NO: 70), were from the incorrect reading frame. Two clusters, transforming growth factor and arsenate resistance protein (SEQ ID NOS: 64 and 66, respectively), had inserts in the reversed orientation relative to the parent mRNA and probably arose due either to incomplete removal of the first strand primer after cDNA synthesis or re-priming on the cDNA strand after first strand synthesis. An additional four clusters were derived from reportedly noncoding regions of the parent mRNA, that is, the 3′ UTR (L-plastin, K-ras oncogene, lysosomal pepstatin insensitive protease, and MYBPC3; SEQ ID NOS: 65, 67, 68, and 69, respectively).

[0101] FIG. 4 shows an alignment of selected Bcl-X_(L)-binding polypeptides with their parental proteins, identified as described above. Each unique fragment was analyzed to determine the location of the amino and carboxyl termini within the parental protein sequence and these amino acids are indicated by residue and number. The number of isolated clones corresponding to each unique fragment was determined and is indicated next to the fragment ID. These fragments are mapped against the parental sequences of BimL, Bax, HSPC300, and TPR.
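Delineating a minimal functional region from overlapping fragments amounts to intersecting their residue ranges within the parental protein. The sketch below illustrates the idea under the assumption that each fragment is described by inclusive start and end residue numbers; the coordinates shown are hypothetical and are not taken from FIG. 4.

```python
def minimal_common_region(fragments):
    """Return the (start, end) residue range shared by all fragments,
    or None if the fragments have no residues in common.

    Each fragment is an inclusive (start, end) pair of residue numbers
    in the parental protein.
    """
    if not fragments:
        return None
    start = max(s for s, _ in fragments)   # latest amino terminus
    end = min(e for _, e in fragments)     # earliest carboxyl terminus
    return (start, end) if start <= end else None


# Hypothetical fragments from one cluster.
print(minimal_common_region([(10, 60), (25, 80), (20, 55)]))
```

A region obtained this way is an upper bound on the minimal binding domain, since every selected fragment contains it but shorter sequences were not necessarily sampled.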

[0102] Affinity and Specificity of the Bcl-X_(L) Binding Polypeptides

[0103] The initial sequencing data showed the relative frequency of each clone in the selected pool. Additional ranking of individual clones may provide valuable insight into the biological relevance of each interaction. For example, a binding affinity consistent with the cellular concentrations of the interacting proteins has been proposed as a litmus test for biological significance (Mayer, Mol. Biotechnol. 13:201-213, 1999). The great flexibility and precise control over assay conditions, such as target concentration and the presence of additives, are among the advantages of the in vitro selection methods of the present invention. By ranking the selected polypeptides based on readily assayable characteristics, it is possible to quickly identify a subset of polypeptides for assays that address the in vivo activity of the identified polypeptides.

[0104] To determine the affinity of the selected Bcl-X_(L)-binding polypeptides, each cluster of selected sequences was aligned and the shortest sequence was generally chosen as representing the minimal binding domain for that particular cluster. It should be noted that this shortest fragment may represent only a partial binding sequence and longer fragments may bind with higher affinity. The chosen clones were prepared as free peptides and used in the binding assay described below.

[0105] Purified radioactively labeled protein from the individual clones was incubated with immobilized Bcl-X_(L)-GST for one hour and, after washing, the amount bound was determined by scintillation counting. The binding at each concentration was normalized to that at the highest concentration and plotted versus concentration. FIG. 5 is a representative plot of the results of this binding assay. A selected Bak fragment (MGQVGRQLAIIGDDINRDYKDDDDKASA; SEQ ID NO: 152), containing a FLAG-M2 epitope, was synthetically produced as a free protein and used in a binding assay in which the concentration of immobilized Bcl-X_(L)-GST was varied from 11 nM to 28 μM. The amount of peptide bound to Bcl-X_(L) was determined by scintillation counting and normalized to that bound at the highest concentration. Normalized binding was then plotted versus Bcl-X_(L) concentration and fit to a binding curve using nonlinear regression. In this assay, all of the clones except one showed binding that was clearly dependent on target concentration. However, only binding curves that gave a high correlation coefficient (R value) were used to determine an affinity.
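The curve fitting step can be illustrated with a simple one-parameter fit. The text does not specify the binding model or the software used, so the following is a sketch under stated assumptions: a one-site model in which fractional occupancy is [target]/(K_d + [target]), normalization to the occupancy at the highest target concentration, and a least-squares scan over a logarithmic grid of candidate K_d values in place of a dedicated regression routine. The data points in the example are synthetic.

```python
import math


def normalized_binding(target_um: float, kd_um: float, top_um: float) -> float:
    """One-site occupancy, normalized to occupancy at the top concentration."""
    occupancy = target_um / (kd_um + target_um)
    top_occupancy = top_um / (kd_um + top_um)
    return occupancy / top_occupancy


def fit_kd(concs_um, observed):
    """Least-squares estimate of K_d (uM) by scanning a log-spaced grid
    from 1e-4 uM to 1e2 uM in steps of 0.001 log units."""
    top = max(concs_um)
    best_kd, best_sse = None, math.inf
    for step in range(6001):
        kd = 10 ** (-4 + step / 1000)
        sse = sum((normalized_binding(c, kd, top) - y) ** 2
                  for c, y in zip(concs_um, observed))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free data generated from an assumed K_d of 0.4 uM
# over the 11 nM to 28 uM range of target concentrations.
concs = [0.011, 0.05, 0.2, 1.0, 5.0, 28.0]
data = [normalized_binding(c, 0.4, 28.0) for c in concs]
print(round(fit_kd(concs, data), 3))
```

A grid scan is used here only to keep the example dependency-free; with real data a standard nonlinear least-squares routine and a goodness-of-fit check, such as the correlation coefficient mentioned above, would normally be applied.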

[0106] Binding affinities of the free Bcl-X_(L)-binding polypeptides (i.e., Bcl-X_(L)-binding polypeptides that are not part of fusions) ranged from approximately 2 nM to 10 μM, demonstrating the great range of affinities accessible by in vitro selection. The twenty clones with the highest affinity are presented in Table 2. The indicated clone from each sequence cluster was produced in vitro and the relative K_(d) was determined for binding to Bcl-X_(L). The total number of clones in that sequence cluster is indicated for comparison of affinity to abundance.

TABLE 2
High Affinity Bcl-X_(L)-binding polypeptides

Clone ID  Protein                                              Accession Number  K_(d) (μM)  Total clones
T44       Bim                                                  NP_006529         0.002       90
T95       Neutrophil cytosolic factor 2                        NP_000424         0.00416     2
V47       Hypothetical protein DKFZp586ho623                   NM_017540         0.0129      16
C21       Novel protein I                                      —                 0.07        3
V18       Novel protein A                                      —                 0.086       3
X56       Talin (splice variant)                               NP_006280         0.093       6
V72       unknown protein from clone 425C14 on chrom. 6q22     Z99129            0.28        1
C32       Bak                                                  NP_001179         0.402       3
Y37       unknown protein from cDNA: FLJ21691, clone COL09555  AK025344          0.41        1
Y75       Bcl-2 related protein BPR                            AF289220          0.42        5
V06       Golgi SNAP receptor complex member 1                 NP_004862         0.467       1
C68       HSPC300                                              AF161418          0.58        35
U58       Syntaxin                                             NP_004595         0.64        12
V50       Tumor protein HDCMB21P                               NP_003286         0.69        11
C49       Bax                                                  NP_001179         0.76        22
U15       Toll-like receptor 3                                 NP_003256         0.781       4
Y01       unknown protein from clone RP11-517O1 on chrom. X    AL355476          1.03        1
W06       Voltage dependent anion channel 3                    NP_005653         1.12        1
V68       Novel protein B                                      —                 1.16        8
T25       Novel protein D                                      —                 1.61        3

[0107] A comparison of K_(d) values of the Bcl-X_(L)-binding polypeptides (Table 2) to their frequency in the pool (Table 1) showed a 65% overlap; of the twenty lowest Bcl-X_(L)-binding polypeptide K_(d) values, thirteen were found within the top twenty most abundant Bcl-X_(L)-binding polypeptides, indicating a correlation between K_(d) and frequency. Five of the Bcl-X_(L)-binding polypeptides from the group with the twenty lowest K_(d) values, however, were observed only a single time, emphasizing the importance of post-selection characterization. Thus, the final representation of any given polypeptide within the selected pool may be determined by a number of factors: its abundance within the initial mRNA population used to prepare the library; the sum of efficiencies at each step in the mRNA display process (PCR, transcription, translation, fusion, etc.); and its affinity to the target.
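The overlap figure can be reproduced by intersecting the clone identifiers listed in Tables 1 and 2. The sketch below transcribes those identifiers; the variable names are illustrative. One Table 1 entry (the proline/glutamine rich splicing factor) lists no clone ID, so only nineteen identifiers appear for that table.

```python
# Clone IDs of the most abundant clusters (Table 1).
TABLE_1_IDS = {
    "T44", "C68", "C55", "C49", "V18", "X42", "V47", "U58", "V50", "V68",
    "X56", "U25", "W17", "Y75", "T42", "U15", "V53", "C32", "T25",
}

# Clone IDs of the twenty highest-affinity clones (Table 2).
TABLE_2_IDS = {
    "T44", "T95", "V47", "C21", "V18", "X56", "V72", "C32", "Y37", "Y75",
    "V06", "C68", "U58", "V50", "C49", "U15", "Y01", "W06", "V68", "T25",
}

shared = sorted(TABLE_1_IDS & TABLE_2_IDS)
overlap_fraction = len(shared) / len(TABLE_2_IDS)
print(len(shared), overlap_fraction)
```

The clones present only in Table 2 are the ones that abundance alone would have missed, which is the point made above about post-selection characterization.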

[0108] As the target used in this selection was a GST fusion protein of Bcl-X_(L), the specificity of each selected polypeptide was also tested by binding it to immobilized GST. The vast majority of Bcl-X_(L)-binding polypeptides exhibited background levels of binding (less than 2%) to GST. Of the eight proteins that bound more than 2% to GST, five bound eight- to ten-fold higher to the Bcl-X_(L)-GST fusion protein and so were deemed specific. The three remaining proteins bound poorly to the Bcl-X_(L)-GST fusion relative to GST alone and so were deemed non-specific.
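The specificity call described above can be written as a simple decision rule. This is a sketch only: the 2% background level comes from the text, but the fold-over-GST cut-off is a hypothetical parameter introduced here (set to 8 because the polypeptides deemed specific bound eight- to ten-fold higher), and the example percentages are invented.

```python
def classify_specificity(target_pct: float, gst_pct: float,
                         background_pct: float = 2.0,
                         min_fold: float = 8.0) -> str:
    """Classify a polypeptide from its percent binding to Bcl-X_(L)-GST
    (target_pct) and to GST alone (gst_pct).

    min_fold is an assumed cut-off, not a value stated in the text.
    """
    if gst_pct <= background_pct:
        # GST binding at background level: binding is attributed to the target.
        return "specific"
    fold = target_pct / gst_pct
    return "specific" if fold >= min_fold else "non-specific"


# Hypothetical measurements (percent of input bound).
for target, gst in [(35.0, 1.0), (40.0, 4.5), (3.0, 5.0)]:
    print(target, gst, classify_specificity(target, gst))
```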

[0109] Many Bcl-X_(L)-Binding Polypeptides Bind to the BH3 Domain ofBcl-X_(L)

[0110] As described above, the Bcl-2 family of proteins has been shown to form homo- and hetero-dimers through the binding of the BH3 domain of one protein in the corresponding binding pocket on its partner. Only three of the selected proteins (Bim, Bak, and Bax) were previously known to contain a BH3 domain. In order to determine if the other proteins bound to the BH3 domain binding site on Bcl-X_(L), a competition assay was performed. The Bak BH3 domain peptide used as a positive control was prepared by chemical synthesis and used to compete with individual Bcl-X_(L)-binding polypeptides in a Bcl-X_(L) binding assay. The effectiveness of this competition was demonstrated in a titration of competitor concentration (FIG. 6). At a fixed concentration of immobilized Bcl-X_(L), the Bak BH3 domain-containing peptide MGQVGRQLAIIGDDINRDYKDDDDKASA (SEQ ID NO: 152), also containing a FLAG-M2 epitope, was added at the indicated concentration along with a trace amount of a selected Talin fragment. After binding for 1 hour, the unbound material was removed and the bound protein was quantitated. The bound protein was assayed by scintillation counting, normalized to that bound in the absence of competitor, and plotted versus competitor concentration.
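
The normalization step described above (bound counts divided by the counts bound in the absence of competitor) can be sketched in Python as follows. The function name and the example counts are hypothetical and are not the values plotted in FIG. 6.

```python
def normalize_binding(counts_by_conc):
    """Express bound counts as a fraction of the no-competitor (0 uM) reaction."""
    reference = counts_by_conc[0.0]
    if reference <= 0:
        raise ValueError("no-competitor reaction must have positive counts")
    # Return fractions ordered by increasing competitor concentration.
    return {
        conc: counts / reference
        for conc, counts in sorted(counts_by_conc.items())
    }


# Hypothetical scintillation counts (cpm) keyed by competitor concentration in uM.
example = {0.0: 8000.0, 1.0: 6000.0, 20.0: 2000.0, 100.0: 400.0}
for conc, fraction in normalize_binding(example).items():
    print(f"{conc:6.1f} uM  {fraction:.2f}")
```

Plotting the resulting fractions against competitor concentration gives a titration curve of the kind referred to in the text.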

[0111] A competition assay was performed for each of the selected Bcl-X_(L)-binding polypeptides using 20 μM Bak BH3 competitor based on the titration shown in FIG. 6. Due to poor competition with the Bcl-X_(L)-binding polypeptides having the lowest K_(d) values (as determined above), a second competition was performed for some of these polypeptides using 100 μM competitor (FIG. 3A). Each Bcl-X_(L)-binding polypeptide was incubated with immobilized Bcl-X_(L) in the presence of competitor and the amount bound was normalized to a comparable reaction without competitor (FIG. 3A; see column labeled BakBH3 effect).

[0112] Most of the Bcl-X_(L)-binding polypeptides were competed by the Bak BH3 domain, indicating that they probably bind at the same site on Bcl-X_(L). The alternative explanation, a decrease in binding of the selected polypeptide at one site due to a change in conformation of the target Bcl-X_(L) upon binding the competitor at a different site, was not tested in this assay. Only three of the selected proteins (clone x42, encoding SEQ ID NO: 35; clone t53, encoding SEQ ID NO: 25; and clone w75, encoding SEQ ID NO: 37) were not competed at all by the BH3 domain, indicating that they may bind to a different site on Bcl-X_(L).

[0113] Alignment of Selected Bcl-X_(L)-Binding Polypeptides

[0114] Competition for binding with the Bak BH3 domain indicated that most of the Bcl-X_(L)-binding polypeptides that were selected were binding at the same site. Therefore, each of the polypeptides was examined for the presence of a BH3 domain sequence. A tentative assignment could be made for most polypeptides. The Bcl-X_(L)-binding polypeptides with the highest affinity (Table 2) are shown in FIG. 7, aligned by sequence homology, where possible, to the known BH3 domains of Bim, Bak, and Bax. Most of the polypeptides have the hallmark periodicity of hydrophobic amino acids indicative of an amphipathic alpha helix. Additional homologies among the sequences are indicated by shading.

[0115] Additional Selection Experiments

[0116] Another selection was initiated to identify Bcl-X_(L)-GST fusion protein binders from mRNA display libraries prepared from tissue-specific mRNAs of human bone marrow, brain, hippocampus, and thymus. Each library contained a unique 8 nucleotide source tag within the 5′ UTR to allow specific amplification of an individual library. The source tags AACTCCTC (SEQ ID NO: 246), AATCTACC (SEQ ID NO: 247), AACAACAC (SEQ ID NO: 248), and AATATTCC (SEQ ID NO: 249) were used for the libraries derived from mRNA from human bone marrow, brain, hippocampus, and thymus, respectively. Prior to initiating the selection, the libraries were pooled.
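
Because each library carries its own 8 nucleotide source tag, a sequenced clone from the pooled selection can in principle be traced back to its tissue of origin by looking for the tag. The Python sketch below illustrates the idea using the four tags listed above; the function name, the 40-base search window, and the example reads are assumptions made for illustration, since the exact position of the tag within the 5′ UTR is not stated here.

```python
# Source tags and tissues as listed in the text (SEQ ID NOS: 246-249).
SOURCE_TAGS = {
    "AACTCCTC": "bone marrow",
    "AATCTACC": "brain",
    "AACAACAC": "hippocampus",
    "AATATTCC": "thymus",
}


def assign_source(read, utr_window=40):
    """Return the tissue whose tag occurs in the first utr_window bases.

    Returns None when no tag is found or when more than one distinct tag
    is found. The window size is a hypothetical choice.
    """
    prefix = read.upper()[:utr_window]
    hits = {tissue for tag, tissue in SOURCE_TAGS.items() if tag in prefix}
    return hits.pop() if len(hits) == 1 else None


# Hypothetical reads for illustration only.
print(assign_source("GGGAACAACACATGGCC"))
print(assign_source("ATGGCCGGG"))
```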

[0117] After five rounds of selection, each library was prepared individually after specific amplification using library-specific primers and analyzed by cloning and sequencing. A total of 10 distinct sequence clusters were identified, of which 2 (Bim and Bax) were already identified in the previous selection. The unique sequences are shown in FIGS. 8A and 8B, and their corresponding nucleic acid sequences in FIG. 8C. Sequences of three of the clones were from known polypeptides (SEQ ID NOS: 224-226), sequences of two of the clones were from hypothetical or unknown polypeptides whose nucleic acid sequences were found in the database (SEQ ID NOS: 227 and 228), and sequences of two of the clones were unique polypeptide sequences (SEQ ID NOS: 229 and 230). All of the selected Bcl-X_(L)-binding polypeptide sequences were from the coding region of the native protein.

[0118] The following selected polypeptides that interacted with the Bcl-X_(L)-GST fusion protein were synthesized and purified as described: SRP9 (clone AttB-Hc-6) and Bmf (clone AttB-Thy-34), which were unique to this selection, and Bax (clone AttB-Hc-7) as a positive control for binding to the Bcl-X_(L)-GST fusion protein. The purified polypeptides were assayed for binding to GST and to the Bcl-X_(L)-GST fusion protein (FIG. 9). Binding of Bax to the Bcl-X_(L)-GST fusion protein was the most efficient (32%), followed by Bmf (6%) and SRP9 (0.65%). Binding of all three purified polypeptides to GST was very low, with binding percentages not higher than 0.25%.
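
Since GST binding is reported only as a ceiling (not higher than 0.25%), the percentages above give a lower bound on how much better each polypeptide binds the Bcl-X_(L)-GST fusion than GST alone. The following Python sketch works through that arithmetic; the function name is an illustrative assumption, and the actual ratios may be larger than these bounds.

```python
GST_UPPER_BOUND = 0.25  # percent bound to GST alone; stated in the text as a ceiling

# Percent bound to the Bcl-X_(L)-GST fusion protein, as stated in the text.
percent_bound = {"Bax": 32.0, "Bmf": 6.0, "SRP9": 0.65}


def min_fold_over_gst(percent_on_target, gst_ceiling=GST_UPPER_BOUND):
    """Lower bound on the target/GST binding ratio, given a ceiling on GST binding."""
    return percent_on_target / gst_ceiling


for name, pct in percent_bound.items():
    print(f"{name}: at least {min_fold_over_gst(pct):.1f}-fold over GST")
```

On these figures the lower bounds are 128-fold for Bax, 24-fold for Bmf, and 2.6-fold for SRP9.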

[0119] High-Throughput Identification of Protein-Protein Interactions

[0120] All of the procedures described above were essentially microcentrifuge tube based. Such systems are readily scalable through the use of microtiter techniques and are amenable to automation. In addition, the relatively laborious step of sequencing can be supplemented or replaced by array-based analysis of the pool, using, for example, Gene Discovery Arrays/Life Grids (Incyte Genomics, Palo Alto, Calif.) according to the manufacturer's instructions. These modifications to mRNA display technology enable its application to high-throughput, genome-wide identification of protein-protein interactions.

[0121] Cloning Full Length Nucleic Acid Molecules Encoding Bcl-X_(L)-Binding Polypeptides

[0122] Nucleic acid molecules encoding the full length polypeptide sequences of the identified Bcl-X_(L)-binding polypeptides can readily be cloned using standard hybridization or PCR cloning techniques and DNA from the source (as determined by the source label), for example, as described in Ausubel et al. (supra). An exemplary method for obtaining the full length polypeptide sequences employs a standard nested PCR strategy that can be used with gene-specific primers (obtained from the nucleic acid sequence encoding the Bcl-X_(L)-binding polypeptide) and flanking adaptors from double stranded cDNA prepared from the source of the identified Bcl-X_(L)-binding polypeptide. In addition, 5′ flanking sequence can be obtained using 5′ RACE techniques known to those of skill in the art.

[0123] Synthesis of Bcl-X_(L)-Binding Polypeptides

[0124] Additional characteristics of the Bcl-X_(L)-binding polypeptides may be analyzed by synthesizing the polypeptides in various cell types or in vitro systems. The function of Bcl-X_(L)-binding polypeptides may then be examined under different physiological conditions. Alternatively, cell lines may be produced which over-express the nucleic acid encoding a Bcl-X_(L)-binding polypeptide, allowing purification of a Bcl-X_(L)-binding polypeptide for biochemical characterization, large-scale production, antibody production, or patient therapy.

[0125] For polypeptide expression, eukaryotic and prokaryotic expression systems may be generated in which nucleic acid sequences encoding Bcl-X_(L)-binding polypeptides are introduced into a plasmid or other vector, which is then used to transform living cells. Constructs in which the nucleic acid sequences are inserted in the correct orientation into an expression plasmid may be used for protein expression. Alternatively, portions of gene sequences encoding the Bcl-X_(L)-binding polypeptide, including wild-type or mutant Bcl-X_(L)-binding polypeptide sequences, may be inserted. Prokaryotic and eukaryotic expression systems allow various important functional domains of the Bcl-X_(L)-binding polypeptides to be recovered, if desired, as fusion proteins, and then used for binding, structural, and functional studies and also for the generation of appropriate antibodies. If Bcl-X_(L)-binding polypeptide expression induces terminal differentiation in some types of cells, it may be desirable to express the protein under the control of an inducible promoter in those cells.

[0126] Standard expression vectors contain promoters that direct the synthesis of large amounts of mRNA corresponding to the inserted nucleic acid encoding a Bcl-X_(L)-binding polypeptide in the plasmid-bearing cells. They may also include eukaryotic or prokaryotic origin of replication sequences allowing for their autonomous replication within the host organism, sequences that encode genetic traits that allow vector-containing cells to be selected in the presence of otherwise toxic drugs, and sequences that increase the efficiency with which the synthesized mRNA is translated. Stable long-term vectors may be maintained as freely replicating entities by using regulatory elements of, for example, viruses (e.g., the OriP sequences from the Epstein Barr Virus genome). Cell lines may also be produced that have integrated the vector into the genomic DNA, and in this manner the gene product is produced on a continuous basis.

[0127] Expression of foreign sequences in bacteria such as Escherichia coli requires the insertion of the nucleic acid sequence encoding a Bcl-X_(L)-binding polypeptide into a bacterial expression vector. Such plasmid vectors contain several elements required for the propagation of the plasmid in bacteria, and for expression of the DNA inserted into the plasmid. Propagation of only plasmid-bearing bacteria is achieved by introducing, into the plasmid, selectable marker-encoding sequences that allow plasmid-bearing bacteria to grow in the presence of otherwise toxic drugs. The plasmid also contains a transcriptional promoter capable of producing large amounts of mRNA from the cloned gene. Such promoters may be (but are not necessarily) inducible promoters that initiate transcription upon induction. The plasmid also preferably contains a polylinker to simplify insertion of the gene in the correct orientation within the vector.

[0128] Once the appropriate expression vectors containing a nucleic acid sequence encoding a Bcl-X_(L)-binding polypeptide, or fragment, fusion, or mutant thereof, are constructed, they are introduced into an appropriate host cell by transformation techniques, including calcium phosphate transfection, DEAE-dextran transfection, electroporation, microinjection, protoplast fusion, and liposome-mediated transfection. The host cells that are transfected with the vectors of this invention may include (but are not limited to) E. coli or other bacteria, yeast, fungi, insect cells (using, for example, baculoviral vectors for expression), or cells derived from mice, humans, or other animals. Mammalian cells can also be used to express the Bcl-X_(L)-binding polypeptides using, for example, a vaccinia virus expression system described, for example, in Ausubel et al. (supra).

[0129] Expression of Bcl-X_(L)-binding polypeptides, fusions, polypeptide fragments, or mutants encoded by cloned DNA is also possible using, for example, the T7 late-promoter expression system. This system depends on the regulated expression of T7 RNA polymerase, an enzyme encoded in the DNA of bacteriophage T7. The T7 RNA polymerase initiates transcription at a specific 23-bp promoter sequence called the T7 late promoter. Copies of the T7 late promoter are located at several sites on the T7 genome, but none is present in E. coli chromosomal DNA. As a result, in T7-infected E. coli cells, T7 RNA polymerase catalyzes transcription of viral genes but not of E. coli genes. In this expression system, recombinant E. coli cells are first engineered to carry the gene encoding T7 RNA polymerase next to the lac promoter. In the presence of IPTG, these cells transcribe the T7 polymerase gene at a high rate and synthesize abundant amounts of T7 RNA polymerase. These cells are then transformed with plasmid vectors that carry a copy of the T7 late promoter upstream of the inserted cDNA. When IPTG is added to the culture medium containing these transformed E. coli cells, large amounts of T7 RNA polymerase are produced. The polymerase then binds to the T7 late promoter on the plasmid expression vectors, catalyzing transcription of the inserted cDNA at a high rate. Since each E. coli cell contains many copies of the expression vector, large amounts of mRNA corresponding to the cloned cDNA can be produced in this system. The resulting protein can be radioactively labeled. Plasmid vectors containing late promoters and the corresponding RNA polymerases from related bacteriophages such as T3, T5, and SP6 may also be used for production of proteins from cloned DNA. E. coli can also be used for expression using an M13 phage such as mGPI-2. Furthermore, vectors that contain phage lambda regulatory sequences, or vectors that direct the expression of fusion proteins, for example, a maltose-binding protein fusion protein or a glutathione-S-transferase fusion protein, also may be used for expression in E. coli.

[0130] Eukaryotic expression systems are useful for obtaining appropriate post-translational modification of expressed polypeptides. Transient transfection of a eukaryotic expression plasmid allows the transient production of Bcl-X_(L)-binding polypeptides by a transfected host cell. Bcl-X_(L)-binding polypeptides may also be produced by a stably-transfected mammalian cell line. A number of vectors suitable for stable transfection of mammalian cells are available to the public (e.g., see Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987), as are methods for constructing such cell lines (see, e.g., Ausubel et al., supra). In one example, a nucleic acid molecule encoding a Bcl-X_(L)-binding polypeptide, fusion, mutant, or polypeptide fragment is cloned into an expression vector that includes the dihydrofolate reductase (DHFR) gene. Integration of the plasmid and, therefore, integration of the nucleic acid sequence encoding the Bcl-X_(L)-binding polypeptide into the host cell chromosome is selected for by inclusion of 0.01-300 μM methotrexate in the cell culture medium (as described, for example, in Ausubel et al., supra). This dominant selection can be accomplished in most cell types. Recombinant protein expression can be increased by DHFR-mediated amplification of the transfected gene. Methods for selecting cell lines bearing gene amplifications are described in Ausubel et al. (supra). These methods generally involve extended culture in medium containing gradually increasing levels of methotrexate. The most commonly used DHFR-containing expression vectors are pCVSEII-DHFR and pAdD26SV(A) (described, for example, in Ausubel et al., supra). The host cells described above or, preferably, a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC Accession No. CRL 9096) are among those most preferred for DHFR selection of a stably-transfected cell line or DHFR-mediated gene amplification.

[0131] Eukaryotic cell expression of Bcl-X_(L)-binding polypeptides facilitates studies of the gene and gene products encoding Bcl-X_(L)-binding polypeptides, including determination of proper expression and post-translational modifications for biological activity, identifying regulatory elements located in the 5′, 3′, and intron regions of nucleic acid molecules encoding Bcl-X_(L)-binding polypeptides and their roles in tissue regulation of Bcl-X_(L)-binding polypeptide expression. It also permits the production of large amounts of normal and mutant proteins for isolation and purification, and the use of cells expressing Bcl-X_(L)-binding polypeptides as a functional assay system for antibodies generated against the protein. Eukaryotic cells expressing Bcl-X_(L)-binding polypeptides may also be used to test the effectiveness of pharmacological agents on apoptotic diseases or as means by which to study Bcl-X_(L)-binding polypeptides as components of a transcriptional activation system. Expression of Bcl-X_(L)-binding polypeptides, fusions, mutants, and polypeptide fragments in eukaryotic cells also enables the study of the function of the normal complete polypeptide, specific portions of the polypeptide, or of naturally occurring polymorphisms and artificially-produced mutated polypeptides. The DNA sequences encoding Bcl-X_(L)-binding polypeptides can be altered using procedures known in the art, such as restriction endonuclease digestion, DNA polymerase fill-in, exonuclease deletion, terminal deoxynucleotide transferase extension, ligation of synthetic or cloned DNA sequences, and site-directed sequence alteration using specific oligonucleotides together with PCR.

[0132] Another preferred eukaryotic expression system is the baculovirus system using, for example, the vector pBacPAK9, which is available from Clontech (Palo Alto, Calif.). If desired, this system may be used in conjunction with other protein expression techniques, for example, the myc tag approach described by Evan et al. (Mol. Cell Biol. 5:3610-3616, 1985).

[0133] Once the recombinant protein is expressed, it can be isolated from the expressing cells by cell lysis followed by protein purification techniques, such as affinity chromatography. In this example, an anti-Bcl-X_(L)-binding polypeptide antibody, which may be produced by the methods described herein, can be attached to a column and used to isolate the recombinant Bcl-X_(L)-binding polypeptides. Lysis and fractionation of Bcl-X_(L)-binding polypeptide-harboring cells prior to affinity chromatography may be performed by standard methods (see, e.g., Ausubel et al., supra). Once isolated, the recombinant protein can, if desired, be purified further, e.g., by high performance liquid chromatography (HPLC; e.g., see Fisher, Laboratory Techniques In Biochemistry And Molecular Biology, Work and Burdon, Eds., Elsevier, 1980).

[0134] Polypeptides of the invention, particularly short Bcl-X_(L)-binding fragments, can also be produced by chemical synthesis (e.g., by the methods described in Solid Phase Peptide Synthesis, 2nd ed., 1984, The Pierce Chemical Co., Rockford, Ill.). These general techniques of polypeptide expression and purification can also be used to produce and isolate useful Bcl-X_(L)-binding polypeptide fragments or analogs, as described herein.

[0135] Those skilled in the art of molecular biology will understand that a wide variety of expression systems may be used to produce the recombinant Bcl-X_(L)-binding polypeptides. The precise host cell used is not critical to the invention. The Bcl-X_(L)-binding polypeptides may be produced in a prokaryotic host (e.g., E. coli) or in a eukaryotic host (e.g., S. cerevisiae, insect cells such as Sf9 cells, or mammalian cells such as COS-1, NIH 3T3, or HeLa cells). These cells are commercially available from, for example, the American Type Culture Collection, Rockville, Md. (see also Ausubel et al., supra). The method of transformation and the choice of expression vehicle (e.g., expression vector) will depend on the host system selected. Transformation and transfection methods are described, e.g., in Ausubel et al. (supra) and expression vehicles may be chosen from those provided, e.g., in Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987.

[0136] In addition, prokaryotic and eukaryotic in vitro systems can be utilized for the generation of Bcl-X_(L)-binding polypeptides. Such methods are described, for example, by Ausubel et al. (supra). Proteins can be synthesized using, for example, in vitro transcription and translation methods. Rabbit reticulocyte lysates, wheat germ lysates, or E. coli lysates can be used to translate exogenous mRNAs from a variety of eukaryotic and prokaryotic sources. Kits for the in vitro production of polypeptides are available, for example, from Ambion (Austin, Tex.).

[0137] Bcl-X_(L)-Binding Polypeptide Fragments

[0138] Polypeptide fragments that incorporate various portions of Bcl-X_(L)-binding polypeptides are useful in identifying the domains or amino acids important for the biological activities of Bcl-X_(L)-binding polypeptides, and the present invention helps to identify these critical domains (FIG. 4). Methods for generating such fragments are well known in the art (see, for example, Ausubel et al. (supra)) using the nucleotide sequences provided herein. For example, a Bcl-X_(L)-binding polypeptide fragment may be generated by PCR amplifying the desired fragment using oligonucleotide primers designed based upon the nucleic acid sequences encoding Bcl-X_(L)-binding polypeptides. Preferably, the oligonucleotide primers include unique restriction enzyme sites that facilitate insertion of the fragment into the cloning site of a mammalian expression vector. This vector may then be introduced into a mammalian cell by artifice by the various techniques known in the art and described herein, resulting in the production of a Bcl-X_(L)-binding polypeptide gene fragment.

[0139] Bcl-X_(L)-binding polypeptide fragments will be useful in evaluating the portions of the polypeptide involved in important biological activities, such as protein-protein interactions. These fragments may be used alone, or as chimeric fusion proteins. Bcl-X_(L)-binding polypeptide fragments may also be used to raise antibodies specific for various regions of Bcl-X_(L)-binding polypeptides. Any portion of the Bcl-X_(L)-binding polypeptide amino acid sequence may be used to generate antibodies.

[0140] Bcl-X_(L)-Binding Polypeptide Antibodies

[0141] In order to prepare polyclonal antibodies, Bcl-X_(L)-binding polypeptides, fragments of Bcl-X_(L)-binding polypeptides, or fusion polypeptides containing defined portions of Bcl-X_(L)-binding polypeptides may be synthesized in bacteria by expression of corresponding DNA sequences in a suitable cloning vehicle. Fusion proteins are commonly used as a source of antigen for producing antibodies. Two widely used expression systems for E. coli are lacZ fusions using the pUR series of vectors and trpE fusions using the pATH vectors. The proteins can be purified, and then coupled to a carrier protein and mixed with Freund's adjuvant (to enhance stimulation of the antigenic response in an inoculated animal) and injected into rabbits or other laboratory animals. Alternatively, protein can be isolated from Bcl-X_(L)-binding polypeptide-expressing cultured cells. Following booster injections at bi-weekly intervals, the rabbits or other laboratory animals are then bled and the sera isolated. The sera can be used directly or can be purified prior to use by various methods, including affinity chromatography employing reagents such as Protein A-Sepharose, antigen-Sepharose, and anti-mouse-Ig-Sepharose. The sera can then be used to probe protein extracts from Bcl-X_(L)-binding polypeptide-expressing tissue electrophoretically fractionated on a polyacrylamide gel to identify Bcl-X_(L)-binding polypeptides. Alternatively, synthetic peptides can be made that correspond to the antigenic portions of the protein and used to inoculate the animals.

[0142] In order to generate a peptide for use in making, for example, Bcl-X_(L)-binding polypeptide-specific antibodies, a Bcl-X_(L)-binding polypeptide sequence may be expressed as a C-terminal fusion with glutathione S-transferase (GST; Smith et al., Gene 67:31-40, 1988). The fusion protein may be purified on glutathione-Sepharose beads, eluted with glutathione, cleaved with thrombin (at the engineered cleavage site), and purified to the degree required to successfully immunize rabbits. Primary immunizations may be carried out with Freund's complete adjuvant and subsequent immunizations performed with Freund's incomplete adjuvant. Antibody titers are monitored by Western blot and immunoprecipitation analyses using the thrombin-cleaved Bcl-X_(L)-binding polypeptide fragment of the Bcl-X_(L)-binding-GST fusion polypeptide. Immune sera are affinity purified using CNBr-Sepharose-coupled Bcl-X_(L)-binding polypeptide. Antiserum specificity may be determined using a panel of unrelated GST fusion proteins.

[0143] Alternatively, monoclonal Bcl-X_(L)-binding polypeptide antibodies may also be produced by using, as an antigen, a Bcl-X_(L)-binding polypeptide isolated from Bcl-X_(L)-binding polypeptide-expressing cultured cells or Bcl-X_(L)-binding polypeptide isolated from tissues. The cell extracts, or recombinant protein extracts containing Bcl-X_(L)-binding polypeptide, may, for example, be injected with Freund's adjuvant into mice. Several days after being injected, the mouse spleens are removed, the tissues are disaggregated, and the spleen cells are suspended in phosphate buffered saline (PBS). The spleen cells serve as a source of lymphocytes, some of which are producing antibody of the appropriate specificity. These are then fused with permanently growing myeloma partner cells, and the products of the fusion are plated into a number of tissue culture wells in the presence of a selective agent such as hypoxanthine, aminopterin, and thymidine (HAT). The wells are then screened by ELISA to identify those containing cells making antibody capable of binding a Bcl-X_(L)-binding polypeptide or polypeptide fragment or mutant thereof. These are then re-plated and, after a period of growth, these wells are again screened to identify antibody-producing cells. Several cloning procedures are carried out until over 90% of the wells contain single clones that are positive for antibody production. From this procedure a stable line of clones that produce the antibody is established. The monoclonal antibody can then be purified by affinity chromatography using Protein A Sepharose, ion-exchange chromatography, as well as variations and combinations of these techniques. Truncated versions of monoclonal antibodies may also be produced by recombinant methods in which plasmids are generated that express the desired monoclonal antibody fragment(s) in a suitable host.

[0144] As an alternate or adjunct immunogen to GST fusion proteins, peptides corresponding to relatively unique hydrophilic regions of Bcl-X_(L)-binding polypeptide may be generated and coupled to keyhole limpet hemocyanin (KLH) through an introduced C-terminal lysine. Antiserum to each of these peptides is similarly affinity-purified on peptides conjugated to BSA, and specificity is tested by ELISA and Western blotting using peptide conjugates, and by Western blotting and immunoprecipitation using Bcl-X_(L)-binding polypeptide, for example, expressed as a GST fusion protein.

[0145] Alternatively, monoclonal antibodies may be prepared using the Bcl-X_(L)-binding polypeptides described above and standard hybridoma technology (see, e.g., Kohler et al., Nature 256:495, 1975; Kohler et al., Eur. J. Immunol. 6:511, 1976; Kohler et al., Eur. J. Immunol. 6:292, 1976; Hammerling et al., In Monoclonal Antibodies and T Cell Hybridomas, Elsevier, New York, N.Y., 1981; and Ausubel et al. (supra)). Once produced, monoclonal antibodies are also tested for specific Bcl-X_(L)-binding polypeptide recognition by Western blot or immunoprecipitation analysis (by the methods described in Ausubel et al., supra).

[0146] Monoclonal and polyclonal antibodies that specifically recognize a Bcl-X_(L)-binding polypeptide (or fragments thereof), such as those described herein, are considered useful in the invention. Antibodies that inhibit the activity of a Bcl-X_(L)-binding polypeptide described herein may be especially useful in preventing or slowing the development of a disease caused by inappropriate expression of a wild type or mutant Bcl-X_(L)-binding polypeptide.

[0147] Antibodies of the invention may be produced using Bcl-X_(L)-binding amino acid sequences that do not reside within highly conserved regions, and that appear likely to be antigenic, as analyzed by criteria such as those provided by the Peptide Structure Program (Genetics Computer Group Sequence Analysis Package, Program Manual for the GCG Package, Version 7, 1991) using the algorithm of Jameson and Wolf (CABIOS 4:181, 1988). These fragments can be generated by standard techniques, e.g., by PCR, and cloned into the pGEX expression vector (Ausubel et al., supra). GST fusion proteins are expressed in E. coli and purified using a glutathione-agarose affinity matrix (as described in Ausubel et al., supra). To generate rabbit polyclonal antibodies, and to minimize the potential for obtaining antisera that are non-specific or exhibit low-affinity binding to a Bcl-X_(L)-binding polypeptide, two or three fusions are generated for each protein, and each fusion is injected into at least two rabbits. Antisera are raised by injections in series, preferably including at least three booster injections.

[0148] In addition to intact monoclonal and polyclonal anti-Bcl-X_(L)-binding polypeptide antibodies, the invention features various genetically engineered antibodies, humanized antibodies, and antibody fragments, including F(ab′)2, Fab′, Fab, Fv, and sFv fragments. Antibodies can be humanized by methods known in the art, e.g., monoclonal antibodies with a desired binding specificity can be commercially humanized (Scotgene, Scotland; Oxford Molecular, Palo Alto, Calif.). Fully human antibodies, such as those expressed in transgenic animals, are also features of the invention (Green et al., Nature Genetics 7:13-21, 1994).

[0149] Ladner (U.S. Pat. Nos. 4,946,778 and 4,704,692) describes methods for preparing single polypeptide chain antibodies. Ward et al. (Nature 341:544-546, 1989) describe the preparation of heavy chain variable domains, which they term “single domain antibodies,” that have high antigen-binding affinities. McCafferty et al. (Nature 348:552-554, 1990) show that complete antibody V domains can be displayed on the surface of fd bacteriophage, that the phage bind specifically to antigen, and that rare phage (one in a million) can be isolated after affinity chromatography. Boss et al. (U.S. Pat. No. 4,816,397) describe various methods for producing immunoglobulins, and immunologically functional fragments thereof, which include at least the variable domains of the heavy and light chain in a single host cell. Cabilly et al. (U.S. Pat. No. 4,816,567) describe methods for preparing chimeric antibodies.

[0150] Affinity reagents or polypeptides from randomized polypeptide libraries that bind tightly to a desired polypeptide, for example, Bcl-X_(L)-binding polypeptides, fragments of Bcl-X_(L)-binding polypeptides, or fusion polypeptides containing defined portions of Bcl-X_(L)-binding polypeptides, can also be obtained using methods known to one skilled in the art. In addition, polypeptide affinity scaffolds may be used to bind a polypeptide of interest or to identify or optimize a polypeptide that binds to a polypeptide of interest, for example, Bcl-X_(L)-binding polypeptides, fragments of Bcl-X_(L)-binding polypeptides, or fusion polypeptides containing defined portions of Bcl-X_(L)-binding polypeptides. Such methods are described, for example, by Lipovsek et al. (WO 00/34784), hereby incorporated by reference.

[0151] Identification of Additional Bcl-X_(L)-Binding Polypeptide Genes

[0152] Standard techniques, such as the polymerase chain reaction (PCR) and DNA hybridization, may be used to clone Bcl-X_(L)-binding polypeptide homologues in other species and Bcl-X_(L)-binding polypeptide-related genes in humans. Bcl-X_(L)-binding-polypeptide-related genes and homologues may be readily identified using low-stringency DNA hybridization or low-stringency PCR with human Bcl-X_(L)-binding polypeptide probes or primers. Degenerate primers encoding human Bcl-X_(L)-binding polypeptides or human Bcl-X_(L)-binding polypeptide-related amino acid sequences may be used to clone additional Bcl-X_(L)-binding polypeptide-related genes and homologues by RT-PCR.

[0153] Alternatively, additional Bcl-X_(L)-binding polypeptides can be identified by utilizing consensus sequence information for Bcl-X_(L)-binding polypeptides to search for similar polypeptides. For example, polypeptide databases can be searched for proteins with the amphipathic alpha helix motif described above in Example 7. Candidate polypeptides containing such a motif can then be tested for their Bcl-X_(L)-binding properties, using methods described herein.
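
One generic way to screen sequences for amphipathic helical character is the mean hydrophobic moment, which sums per-residue hydropathy as vectors rotated 100° per residue around an ideal alpha helix. The Python sketch below is offered only as an illustration of such a search and is not the motif definition of Example 7; the choice of the Kyte-Doolittle hydropathy scale, the 18-residue window, and the cutoff of 2.0 are all assumptions.

```python
import math

# Kyte-Doolittle hydropathy values; using this particular scale is an assumption.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def hydrophobic_moment(segment, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a segment modeled as an ideal alpha helix."""
    step = math.radians(degrees_per_residue)
    x = sum(KD[aa] * math.cos(i * step) for i, aa in enumerate(segment))
    y = sum(KD[aa] * math.sin(i * step) for i, aa in enumerate(segment))
    return math.hypot(x, y) / len(segment)


def candidate_windows(sequence, window=18, threshold=2.0):
    """Yield (start, moment) for windows at or above a hypothetical cutoff."""
    for start in range(len(sequence) - window + 1):
        moment = hydrophobic_moment(sequence[start:start + window])
        if moment >= threshold:
            yield start, moment
```

A segment made of a single residue type has a moment near zero because the vectors cancel around the helix, whereas a segment with hydrophobic residues clustered on one face scores high. Any window flagged this way would still need to be tested experimentally for Bcl-X_(L) binding.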

[0154] Assays for Compounds that Modulate Bcl-X_(L)-Binding Polypeptide Biological Activity

[0155] Bcl-X_(L)-binding polypeptide biological activity may be modulated in a number of different ways. For example, cellular concentrations of Bcl-X_(L)-binding polypeptides can be altered, which would, in turn, affect overall Bcl-X_(L)-binding polypeptide biological activity. This is achieved, for example, by administering to a cell a compound that alters the concentration and/or activity of a Bcl-X_(L)-binding polypeptide.

[0156] We have shown herein that a number of polypeptides bind a Bcl-X_(L) polypeptide. Accordingly, compounds that modulate Bcl-X_(L)-binding polypeptide biological activity may be identified using any of the methods described herein (or any analogous method known in the art) for measuring protein-protein interactions involving a Bcl-X_(L)-binding polypeptide. For example, the Bcl-X_(L)/Bcl-X_(L)-binding polypeptide assays described above may be used to determine whether the addition of a test compound increases or decreases binding activity of any (wild-type or mutant) Bcl-X_(L)-binding polypeptide to Bcl-X_(L). A compound that increases or decreases the binding activity of a mutant Bcl-X_(L)-binding polypeptide may be useful for treating a Bcl-X_(L)-binding polypeptide-related disease, such as an apoptotic or proliferative disease. A compound that modulates Bcl-X_(L)-binding polypeptide biological activity may act by binding either to a Bcl-X_(L)-binding polypeptide or to Bcl-X_(L) itself, thereby reducing or preventing the biological activity of the Bcl-X_(L)-binding polypeptide.
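
The comparison described above, binding measured with and without a test compound, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `classify_compound`, the use of a single background-corrected signal per condition, and the 25% change threshold are all hypothetical and are not specified anywhere in this document.

```python
def classify_compound(control_signal, treated_signal, threshold=0.25):
    """Classify a test compound by its effect on a binding signal.

    control_signal: binding signal without the compound (must be > 0).
    treated_signal: binding signal with the compound.
    threshold: fractional change treated as meaningful (assumed value).

    Returns (label, fractional_change).
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    change = (treated_signal - control_signal) / control_signal
    if change >= threshold:
        return "increases binding", change
    if change <= -threshold:
        return "decreases binding", change
    return "no clear effect", change


if __name__ == "__main__":
    # Made-up signal values, used only to show the call pattern.
    for treated in (150.0, 110.0, 40.0):
        label, change = classify_compound(100.0, treated)
        print(f"treated={treated}: {label} ({change:+.0%})")
```

In practice the threshold would be set from the replicate variability of the particular assay rather than fixed in advance.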

[0157] Levels of Bcl-X_(L)-binding polypeptide may be modulated by modulating transcription, translation, or mRNA or protein turnover; such modulation may be detected using known methods for measuring mRNA and protein levels, e.g., RT-PCR and ELISA.

[0158] Test Compounds

[0159] In general, drugs for modulation of Bcl-X_(L)-binding polypeptide biological activity may be identified from large libraries of natural products or synthetic (or semi-synthetic) extracts or chemical libraries according to methods known in the art. Those skilled in the field of drug discovery and development will understand that the precise source of test extracts or compounds is not critical to the screening procedure(s) of the invention. Accordingly, virtually any number of chemical extracts or compounds can be screened using the exemplary methods described herein. Examples of such extracts or compounds include, but are not limited to, plant-, fungal-, prokaryotic-, or animal-based extracts, fermentation broths, and synthetic compounds, as well as modifications of existing compounds. Numerous methods are also available for generating random or directed synthesis (e.g., semi-synthesis or total synthesis) of any number of chemical compounds, including, but not limited to, saccharide-, lipid-, peptide-, and nucleic acid-based compounds. Synthetic compound libraries are commercially available, e.g., from Brandon Associates (Merrimack, N.H.) and Aldrich Chemical (Milwaukee, Wis.). Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant, and animal extracts are commercially available from a number of sources, including Biotics (Sussex, UK), Xenova (Slough, UK), Harbor Branch Oceanographic Institution (Ft. Pierce, Fla.), and PharmaMar, U.S.A. (Cambridge, Mass.). In addition, natural and synthetically produced libraries are generated, if desired, according to methods known in the art, e.g., by standard extraction and fractionation methods. Furthermore, if desired, any library or compound is readily modified using standard chemical, physical, or biochemical methods.

[0160] In addition, those skilled in the art of drug discovery and development readily understand that methods for dereplication (e.g., taxonomic dereplication, biological dereplication, and chemical dereplication, or any combination thereof) or the elimination of replicates or repeats of materials already known for their Bcl-X_(L)-binding polypeptide-modulatory activities should be employed whenever possible.

[0161] When a crude extract is found to modulate (i.e., stimulate or inhibit) Bcl-X_(L)-binding polypeptide biological activity, further fractionation of the positive lead extract is necessary to isolate chemical constituents responsible for the observed effect. Thus, the goal of the extraction, fractionation, and purification process is the careful characterization and identification of a chemical entity within the crude extract having an activity that stimulates or inhibits Bcl-X_(L)-binding polypeptide biological activity. The same assays described herein for the detection of activities in mixtures of compounds can be used to purify the active component and to test derivatives thereof. Methods of fractionation and purification of such heterogeneous extracts are known in the art. If desired, compounds shown to be useful agents for treatment are chemically modified according to methods known in the art. Compounds identified as being of therapeutic value may be subsequently analyzed using animal models for diseases in which it is desirable to increase or decrease Bcl-X_(L)-binding polypeptide biological activity.

[0162] Construction of Transgenic Animals and Knockout Animals

[0163] Characterization of Bcl-X_(L)-binding polypeptide genes provides information that allows Bcl-X_(L)-binding polypeptide knockout animal models to be developed by homologous recombination. Similarly, animal models of Bcl-X_(L)-binding polypeptide overproduction may be generated by integrating one or more Bcl-X_(L)-binding polypeptide sequences into the genome, according to standard transgenic techniques. Moreover, the effect of Bcl-X_(L)-binding polypeptide gene mutations (e.g., dominant gene mutations) may be studied using transgenic mice carrying mutated Bcl-X_(L)-binding polypeptide transgenes or by introducing such mutations into the endogenous Bcl-X_(L)-binding polypeptide gene, using standard homologous recombination techniques.

[0164] Bcl-X_(L)-binding polypeptide knockout mice provide a tool for studying the role of Bcl-X_(L)-binding polypeptide in embryonic development and in disease. Moreover, such mice provide the means, in vivo, for testing therapeutic compounds for amelioration of diseases or conditions involving a Bcl-X_(L)-binding polypeptide-dependent or Bcl-X_(L)-binding polypeptide-affected pathway.

[0165] Construction of Polypeptide Knockout or Overexpressing Cell Lines

[0166] Characterization of Bcl-X_(L)-binding polypeptide genes also allows Bcl-X_(L)-binding polypeptide cell culture models to be developed, in which the Bcl-X_(L)-binding polypeptide is expressed or functions at a lower level than in its wild-type counterpart cell. Such cell lines can be developed using standard antisense technologies. Similarly, cell culture models of Bcl-X_(L)-binding polypeptide overproduction or overactivation may be generated by integrating one or more Bcl-X_(L)-binding polypeptide sequences into the genome, according to standard molecular biology techniques. Moreover, the effect of Bcl-X_(L)-binding polypeptide gene mutations (e.g., dominant gene mutations) may be studied using cell culture models in which the cells contain and overexpress a mutated Bcl-X_(L)-binding polypeptide.

[0167] Bcl-X_(L)-binding polypeptide knockout cells provide a tool for studying the role of Bcl-X_(L)-binding polypeptide in cellular events, including apoptosis. Moreover, such cell lines provide the cell culture means for testing therapeutic compounds for modulation of the apoptotic pathway. Compounds that modulate apoptosis in these cell models can then be tested in animal models of diseases or conditions involving the apoptotic pathway.

Other Embodiments

[0168] In other embodiments, the invention includes any polypeptide that is substantially identical to a Bcl-X_(L)-binding polypeptide; such homologues include other substantially pure naturally-occurring Bcl-X_(L)-binding polypeptides as well as natural mutants; induced mutants; DNA sequences that encode polypeptides and also hybridize to the nucleic acid sequence encoding a Bcl-X_(L)-binding polypeptide described herein under high stringency conditions or, less preferably, under low stringency conditions (e.g., washing at 2× SSC at 40° C. with a probe length of at least 40 nucleotides); and proteins specifically bound by antisera directed to a Bcl-X_(L)-binding polypeptide. The invention also includes chimeric polypeptides that include a Bcl-X_(L)-binding polypeptide portion.

[0169] The invention further includes analogs of any naturally-occurring Bcl-X_(L)-binding polypeptide. Analogs can differ from the naturally-occurring Bcl-X_(L)-binding polypeptide by amino acid sequence differences, by post-translational modifications, or by both. Analogs of the invention will generally exhibit at least 85%, more preferably, 90%, and most preferably, 95% or even 99% identity with all or part of a naturally-occurring Bcl-X_(L)-binding polypeptide sequence. The length of sequence comparison is at least 5 amino acid residues, preferably, at least 10 amino acid residues, and more preferably, the full length of the polypeptide sequence. Modifications include in vivo and in vitro chemical derivatization of polypeptides, e.g., acetylation, carboxylation, phosphorylation, or glycosylation; such modifications may occur during polypeptide synthesis or processing or following treatment with isolated modifying enzymes. Analogs can also differ from the naturally-occurring Bcl-X_(L)-binding polypeptide by alterations in primary sequence. These include genetic variants, both natural and induced (for example, resulting from random mutagenesis by irradiation or exposure to ethyl methanesulfonate or by site-specific mutagenesis as described in Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual (2d ed.), CSH Press, 1989, or Ausubel et al., supra). Also included are cyclized peptides, molecules, and analogs that contain residues other than L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids.
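
The identity thresholds recited above can be illustrated with a short calculation. The sketch below assumes the two sequences are already aligned without gaps and have equal length, which is a simplification; the function names `percent_identity` and `identity_tier` are hypothetical, and the single-residue variant is invented purely for the example. SEQ ID NOS: 2 and 41 are taken from the sequence listing, where they are identical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with the same residue in two equal-length sequences.

    Assumes an ungapped, position-by-position comparison (a simplification).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length for this comparison")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Return the highest recited threshold (99, 95, 90, or 85%) that is met."""
    for cutoff in (99, 95, 90, 85):
        if pct >= cutoff:
            return f"at least {cutoff}%"
    return "below 85%"


SEQ_ID_2 = "GQVGRQLAIIGDDINRRK"
SEQ_ID_41 = "GQVGRQLAIIGDDINRRK"
# Invented variant with one substitution (K18A), for illustration only.
VARIANT = "GQVGRQLAIIGDDINRRA"

if __name__ == "__main__":
    for name, other in (("SEQ ID NO: 41", SEQ_ID_41), ("variant", VARIANT)):
        pct = percent_identity(SEQ_ID_2, other)
        print(f"SEQ ID NO: 2 vs {name}: {pct:.1f}% ({identity_tier(pct)})")
```

For an 18-residue peptide a single substitution already lowers identity to about 94%, which shows why the length of the comparison window matters when applying percentage thresholds to short sequences.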

[0170] All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each independent publication or patent application was specifically and individually indicated to be incorporated by reference.

[0171] While the invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure that come within known or customary practice within the art to which the invention pertains and may be applied to the essential features hereinbefore set forth, and follows in the scope of the appended claims.

1 253 1 35 PRT Homo sapiens 1 Ala Ser Met Arg Gln Ala Glu Pro Ala AspMet Arg Pro Glu Ile Trp 1 5 10 15 Ile Ala Gln Glu Leu Arg Arg Ile GlyAsp Glu Phe Asn Ala Tyr Tyr 20 25 30 Ala Arg Glu 35 2 18 PRT Homosapiens 2 Gly Gln Val Gly Arg Gln Leu Ala Ile Ile Gly Asp Asp Ile AsnArg 1 5 10 15 Arg Lys 3 32 PRT Homo sapiens 3 Lys Leu Ser Glu Cys LeuLys Arg Ile Gly Asp Glu Leu Asp Ser Asn 1 5 10 15 Met Glu Leu Gln ArgMet Ile Ala Ala Val Asp Thr Asp Ser Pro Arg 20 25 30 4 46 PRT Homosapiens 4 Thr Gly Lys Glu Ala Ile Leu Arg Arg Leu Val Ala Leu Leu GluGlu 1 5 10 15 Glu Ala Glu Val Ile Asn Gln Lys Leu Ala Ser Asp Pro AlaLeu Arg 20 25 30 Ser Lys Leu Val Arg Leu Ser Ser Asp Ser Phe Ala His Leu35 40 45 5 30 PRT Homo sapiens 5 Gln Arg Gly Met Leu Tyr Tyr Gln Thr GluLys Tyr Asp Leu Ala Ile 1 5 10 15 Lys Asp Leu Lys Glu Ala Leu Ile GlnLeu Arg Gly Asn Asn 20 25 30 6 38 PRT Homo sapiens 6 Gly Gly Glu Ser AspThr Asp Pro His Phe Gln Asp Ala Leu Met Gln 1 5 10 15 Leu Ala Lys AlaVal Ala Ser Ala Ala Ala Ala Leu Val Leu Lys Ala 20 25 30 Lys Ser Val AlaGln Arg 35 7 35 PRT Homo sapiens 7 Gly Thr Arg Gln Asp Arg Met Phe GluThr Met Ala Ile Glu Ile Glu 1 5 10 15 Gln Leu Leu Ala Arg Leu Thr GlyVal Asn Asp Lys Met Ala Glu Tyr 20 25 30 Thr Asn Ala 35 8 33 PRT Homosapiens 8 Ala Val Gln Glu Asp Pro Val Gln Arg Glu Ile His Gln Asp TrpAla 1 5 10 15 Asn Arg Glu Tyr Ile Glu Ile Ile Thr Ser Ser Ile Lys LysIle Ala 20 25 30 Asp 9 33 PRT Homo sapiens 9 Ala Thr Arg Gln Ala Leu AsnGlu Ile Ser Ala Arg His Ser Gly Ile 1 5 10 15 Gln Gln Leu Glu Arg SerIle Arg Glu Leu His Asp Ile Phe Thr Phe 20 25 30 Leu 10 28 PRT Homosapiens 10 Met Phe Ser Asp Ile Tyr Gly Ile Arg Glu Ile Ala Asp Gly LeuCys 1 5 10 15 Leu Glu Val Glu Gly Lys Met Val Ser Arg Pro Glu 20 25 1125 PRT Homo sapiens 11 Phe Trp Leu Glu Glu Arg Asp Phe Glu Ala Gly ValPhe Glu Leu Glu 1 5 10 15 Ala Ile Val Asn Ser Ile Lys Arg Ser 20 25 1240 PRT Homo sapiens 12 Met Lys Trp Asp Thr Asp Asn Thr Leu Gly Thr GluIle Ser Trp Glu 1 5 10 15 Asn Lys Leu Ala Glu Gly Leu 
Lys Leu Thr LeuAsp Thr Ile Phe Val 20 25 30 His His Val Leu His Ala Pro His 35 40 13 31PRT Homo sapiens 13 Arg Gly Ala Val Phe Ser Gln Asp Lys Asp Val Val GlnGlu Ala Thr 1 5 10 15 Lys Val Leu Arg Asn Ala Ala Asp Asn Phe Tyr IleAsn Asp Arg 20 25 30 14 33 PRT Homo sapiens 14 Thr Gly Thr Gly Ala ProArg Phe Ile Lys Glu Val Gln Glu Leu Asn 1 5 10 15 Ser Ala Leu His GlnSer Asp Leu Ile Asp Ile Tyr Arg Thr Leu His 20 25 30 Pro 15 20 PRT Homosapiens 15 Ser Asn Glu Leu Thr Arg Ala Val Glu Glu Leu His Lys Leu LeuLys 1 5 10 15 Glu Ala Arg Glu 20 16 33 PRT Homo sapiens 16 Thr Tyr TrpAsn Leu Leu Pro Pro Lys Arg Pro Ile Lys Glu Val Leu 1 5 10 15 Thr AspIle Phe Ala Lys Val Leu Glu Lys Gly Trp Val Asp Ser Arg 20 25 30 Ser 1719 PRT Homo sapiens 17 Leu Phe Thr Ile Leu Leu Thr Leu Trp Thr Met ArgCys Ser Ser Thr 1 5 10 15 Pro Ser Gly 18 28 PRT Homo sapiens 18 Ala GlyGlu Asp Met Glu Ile Ser Val Lys Glu Leu Arg Thr Ile Leu 1 5 10 15 AsnArg Ile Ile Ser Lys His Lys Asp Leu Arg Thr 20 25 19 26 PRT Homo sapiens19 Gly Leu Arg Glu Glu Ser Glu Glu Tyr Met Ala Ala Ala Asp Glu Tyr 1 510 15 Asn Arg Leu Lys Gln Val Lys Gln Pro Ala 20 25 20 67 PRT Homosapiens VARIANT 58, 62, 65 Xaa = Any Amino Acid 20 Lys Gly Ile Ile SerArg Leu Met Ser Val Glu Glu Glu Leu Lys Arg 1 5 10 15 Asp His Ala GluMet Gln Ala Gly Cys Gly Leu Gln Thr Glu Asp His 20 25 30 Leu Met Pro ArgArg Ser Ala Phe Ala Ser Leu Asp Ala Val Asn Ala 35 40 45 Arg Leu Met SerAla Leu Thr Pro Ala Xaa Arg Tyr Val Xaa His Cys 50 55 60 Xaa Pro Leu 6521 26 PRT Homo sapiens 21 Trp Glu Arg Ile Glu Glu Arg Leu Ala Tyr IleAla Asp His Leu Gly 1 5 10 15 Phe Ser Trp Thr Glu Leu Ala Arg Ala Leu 2025 22 27 PRT Homo sapiens 22 Ala Arg Gly Asp Phe Ala Gln Ala Ala Gln GlnLeu Trp Leu Ala Leu 1 5 10 15 Arg Ala Leu Gly Arg Pro Leu Pro Thr SerHis 20 25 23 30 PRT Homo sapiens 23 Gly Ser Ser Lys Asp Leu Ala Lys HisIle Gln Val Val Cys Asp Gly 1 5 10 15 Met Asp Leu Thr Pro Lys Ile HisAsp Leu Lys Pro Gln Cys 20 25 30 24 33 PRT Homo sapiens 24 Gly Phe LeuAla Ala Glu Gln 
Asp Ile Arg Glu Glu Ile Arg Lys Val 1 5 10 15 Val GlnSer Leu Glu Gln Thr Ala Arg Glu Val Leu Thr Leu Leu Gln 20 25 30 Gly 2533 PRT Homo sapiens 25 Leu Asp Pro Val Lys Asp Val Leu Ile Leu Ser AlaLeu Arg Arg Met 1 5 10 15 Leu Trp Ala Ala Asp Asp Phe Leu Glu Asp LeuPro Phe Glu Gln Ile 20 25 30 Gly 26 31 PRT Homo sapiens 26 Ala Asn LeuLeu Leu Leu Met Val Pro Ile Leu Ile Ala Met Ala Phe 1 5 10 15 Leu MetLeu Thr Glu Arg Lys Ile Leu Gly Tyr Ile Gln Pro Arg 20 25 30 27 30 PRTHomo sapiens 27 Leu Arg Leu Asn Thr Thr Val Trp Pro Thr Ile Ile Thr ProIle Leu 1 5 10 15 Leu Thr Leu Phe Leu Ile Thr Asn Arg Leu Ile Thr ThrArg 20 25 30 28 26 PRT Homo sapiens 28 Thr Leu Tyr Leu Lys Leu Thr AlaLeu Ala Val Thr Phe Leu Gly Leu 1 5 10 15 Leu Thr Ala Leu Asp Leu AsnTyr Pro Thr 20 25 29 44 PRT Homo sapiens 29 Ala Gly Val Phe Ser Ala GluPro Ser Pro Phe Pro Gln Thr Arg Arg 1 5 10 15 Ser Met Val Phe Ala ArgHis Leu Arg Glu Val Gly Asp Glu Phe Arg 20 25 30 Ser Arg His Leu Asn SerThr Asp Asp Ala Asp Glu 35 40 30 45 PRT Homo sapiens 30 Gly Leu Lys LeuAla Thr Val Ala Ala Ser Met Asp Arg Val Pro Lys 1 5 10 15 Val Thr ProSer Ser Ala Ile Ser Ser Ile Ala Arg Glu Asn His Glu 20 25 30 Pro Glu ArgLeu Gly Leu Asn Gly Ile Ala Glu Thr Thr 35 40 45 31 26 PRT Homo sapiens31 Met Arg Asp Leu Pro Gly His Tyr Tyr Glu Thr Leu Lys Phe Leu Val 1 510 15 Gly His Leu Lys Thr Ile Ala Asp His Arg 20 25 32 42 PRT Homosapiens 32 Cys Gly Gly Arg Met Glu Asp Ile Pro Cys Ser Arg Val Gly HisIle 1 5 10 15 Tyr Arg Lys Tyr Val Pro Tyr Lys Val Pro Ala Gly Val SerLeu Ala 20 25 30 Arg Asn Leu Lys Arg Val Ala Asp Trp Met 35 40 33 37 PRTHomo sapiens 33 Ala Leu Ser Trp Ile Glu Met Asp Thr Glu Met Glu Met LeuLeu Ala 1 5 10 15 Arg Phe Arg Arg Thr Pro Gly Asp Leu His Leu Asp HisSer Val His 20 25 30 Leu Cys Ala His Pro 35 34 11 PRT Homo sapiens 34Thr Ser Thr Leu Pro His Ile Arg Arg Thr Arg 1 5 10 35 12 PRT Homosapiens 35 Asn Gly Asn Leu Phe Ala Ser Phe Ile Ala Asp Ser 1 5 10 36 29PRT Homo sapiens 36 Ile Leu Thr Ser Pro Trp Thr Thr Ser Ser 
Gly Leu TrpPro Arg Leu 1 5 10 15 Gln Lys Ala Ala Glu Ala Phe Lys Gln Leu Asn GlnPro 20 25 37 32 PRT Homo sapiens 37 Arg Thr Leu Gln Pro Arg Leu Leu GlnAsn Gln Gln Gln His Leu Pro 1 5 10 15 Ala Leu Pro Ile Trp Phe Leu LeuGln Trp Leu Arg Leu His Pro Leu 20 25 30 38 37 PRT Homo sapiens 38 MetAla Val Ile Ile Asn Glu Leu Ser Gln Arg Asp Ser Cys Gly Pro 1 5 10 15Leu Lys Ile Ser Leu Asn Asn Lys Ile Leu Val Tyr Gly Asn Leu Phe 20 25 30Ser Ser Phe Thr Pro 35 39 16 PRT Homo sapiens 39 Gly Leu Ala Lys Lys SerLys Arg Asn Pro Ala Asn Leu Thr Pro Pro 1 5 10 15 40 20 PRT Homo sapiens40 Ser Ser Gln Ala Leu Arg Ile His Gln Trp Leu His Leu Phe Ser Asp 1 510 15 Phe Thr Ser Thr 20 41 18 PRT Homo sapiens 41 Gly Gln Val Gly ArgGln Leu Ala Ile Ile Gly Asp Asp Ile Asn Arg 1 5 10 15 Arg Lys 42 26 PRTHomo sapiens 42 Gly Val Ser Glu Ala Glu Gly Thr Phe Pro Leu Ser Thr PheLeu Leu 1 5 10 15 Gly Ile Ala Ser Arg Leu Arg Ser Val Ala 20 25 43 31PRT Homo sapiens 43 Arg Ala Pro Arg Phe Ile Lys Gln Ile Leu Leu Asp LeuLys Arg Glu 1 5 10 15 Ile Asp Phe Asn Val Arg Leu Val Glu Tyr Phe AsnPro Leu Ser 20 25 30 44 26 PRT Homo sapiens 44 Ile Val Ala Ile Ile AlaGly Arg Leu Arg Met Leu Gly Asp Gln Phe 1 5 10 15 Asn Gly Glu Leu GluAla Ser Ala Lys Asn 20 25 45 29 PRT Homo sapiens 45 Leu Ala Leu Ala TyrTyr Ser Ser Arg Gln Tyr Ala Ser Ala Leu Lys 1 5 10 15 His Ile Ala GluIle Ile Glu Arg Gly Ile Arg Gln His 20 25 46 38 PRT Homo sapiens 46 AlaAla Met Leu Leu Asp Arg Arg Gly Thr Glu Cys Asp Leu Trp Ile 1 5 10 15Asn Glu Met Ser Leu Leu His Lys Ile Val Gln Asp Val Tyr Gly Thr 20 25 30Pro His Pro Pro His Ser 35 47 22 PRT Homo sapiens 47 Pro Trp Gln Tyr LysPro Ile Ala Asp Leu Tyr Arg Gly Arg Glu Ser 1 5 10 15 Arg Pro Ser AlaPro Arg 20 48 18 PRT Homo sapiens 48 Leu Phe Ser Val Leu Leu Arg Tyr LeuAla Asp Asn Phe Leu Pro Gly 1 5 10 15 Gly Ser 49 18 PRT Homo sapiens 49Asp Trp Gln Val Leu Leu Gly Lys Leu Leu Trp Lys Ile Asp Asn Pro 1 5 1015 Gly Ile 50 22 PRT Homo sapiens 50 Gly Ala Met Glu Arg Glu Trp Ala MetPhe Leu Arg Ala 
Ala Ser Ser 1 5 10 15 Arg Ile Arg Gly Gly Val 20 51 24PRT Homo sapiens 51 Val His Asn Phe Gly Arg His Trp Gly Leu Pro Leu SerPhe Leu Leu 1 5 10 15 Asn Tyr Pro Leu Phe Leu Ser Pro 20 52 40 PRT Homosapiens 52 Ala Ser Met Ala Pro Val Gly Arg Asp Ala Glu Thr Leu Gln LysGln 1 5 10 15 Lys Glu Thr Ile Lys Ala Phe Leu Lys Lys Leu Glu Ala LeuMet Ala 20 25 30 Ser Asn Asp Asn Ala Asn Lys Thr 35 40 53 33 PRT Homosapiens 53 Cys Arg Glu Gln Ala Glu Leu Thr Gly Leu Arg Leu Ala Ser LeuGly 1 5 10 15 Leu Lys Phe Asn Lys Ile Val His Ser Ser Met Thr Arg AlaIle Glu 20 25 30 Thr 54 22 PRT Homo sapiens 54 Gly Thr Arg Ile Ser AspMet Leu Lys Leu Ile Ala Asp Thr Trp Gln 1 5 10 15 Arg Asn Cys Cys ProAla 20 55 26 PRT Homo sapiens 55 Glu Gln Ala Ser Val Lys Tyr Val Ile LeuAsp Met Tyr Arg Ala Leu 1 5 10 15 Leu Thr Leu Met Asn Thr Ser Thr AlaThr 20 25 56 20 PRT Homo sapiens 56 Glu Asp Leu Glu Ser Val Leu Ile ArgLeu Ile Asn Trp Ala Lys Gly 1 5 10 15 Ser Pro Ile Pro 20 57 25 PRT Homosapiens 57 Arg Pro Val Ser Phe Cys Gly Ala Val Trp Thr Leu Asn Arg AlaIle 1 5 10 15 Gly Arg His Phe Val Arg Gly Ser Arg 20 25 58 29 PRT Homosapiens 58 His Ala Val Val Ala Arg Leu Leu His Ile Gly Ala Ile Met PheGln 1 5 10 15 Arg Leu Asp Phe Ile Glu Gln Leu Ser Ala Pro Pro Ala 20 2559 31 PRT Homo sapiens 59 Gly Gln Gly Thr Leu Trp Gly Ser Gly Met GluAla Trp Leu Ala Thr 1 5 10 15 Val Leu Lys Ala Leu Pro Trp His Pro ThrTyr Gln Leu Glu Pro 20 25 30 60 28 PRT Homo sapiens 60 Ile Ala Gln AlaThr Lys Ala Thr Ile Asp Lys Trp Asn Cys Ile Lys 1 5 10 15 Leu Lys IlePhe Tyr Thr Ser Lys Lys Glu Ala Ser 20 25 61 22 PRT Homo sapiens 61 ValVal Asp Val Pro Asp Phe Ile Val Trp Leu Glu Glu Ala Val Ser 1 5 10 15Asp Leu His Arg Ala Leu 20 62 39 PRT Homo sapiens 62 Gln Arg Arg Gly AsnGlu Phe Gln Leu Arg Asp Leu Ala Asp Ala Trp 1 5 10 15 Asp Leu Ser SerArg Ser Arg Gln Arg Gly Trp Gln Met Pro Asn Cys 20 25 30 Arg Ser Arg ArgGly Pro Gly 35 63 18 PRT Homo sapiens 63 Arg Gly Leu Trp Val Asp Arg ValLeu Glu Glu Trp Gly Leu Glu Pro 1 5 10 15 Arg Gln 
64 28 PRT Homo sapiens64 Phe Val Arg Ser Val Gly Trp Arg Leu Gln Asn Ile Gly Asp Asp Met 1 510 15 Asp His Ala Ile Cys Gly His Asp Val Arg Leu Gly 20 25 65 13 PRTHomo sapiens 65 Ser Gly Leu Arg Lys Pro Thr Cys Gly Ser Ser Gln Arg 1 510 66 25 PRT Homo sapiens 66 Ala Gly Thr Gln Pro Leu Ile Leu Ala Gln PheMet Arg Val Gly Gly 1 5 10 15 Asp Glu Leu Leu His Phe Leu Leu Trp 20 2567 32 PRT Homo sapiens 67 Met Asp Thr Ile Lys Gly Phe Asp Leu Ile ThrAsn Phe Gln Val Val 1 5 10 15 Ala Asp Ala Leu Asn Ile Ser Leu Leu ProAsn Pro Leu Ala Thr Ala 20 25 30 68 22 PRT Homo sapiens 68 Ala Thr TrpMet Lys Thr Leu Gln Gly Leu Leu Asp Arg Ile Gln Ala 1 5 10 15 Phe ProSer Ser Pro His 20 69 30 PRT Homo sapiens 69 Glu Ala Asn Arg Lys Gln ProLys Pro Asn Asn Ser Ser Thr Ala Tyr 1 5 10 15 Tyr Asn Phe Thr Gly ValSer Ile Leu Pro Ser Tyr Lys Pro 20 25 30 70 16 PRT Homo sapiens 70 GlySer Leu Thr His His Ile Asn Asn Ile Lys Pro Ser Ser Thr Arg 1 5 10 15 7128 PRT Homo sapiens 71 Val Ser Cys Trp Pro Ser Tyr Leu Lys Tyr Pro LeuSer Thr Ala Ser 1 5 10 15 Ala Ser Leu Leu Ala Thr Gln Leu Lys Ser IleAla 20 25 72 199 DNA Homo sapiens 72 taatacgact cactataggg acaattactatttacaattc ttacttcaca atggcttcca 60 tgaggcaggc tgaacctgca gatatgcgcccagagatatg gatcgcccaa gagttgcggc 120 gtattggaga cgagtttaac gcctactatgcaagggagga ttacaaagac gatgacgata 180 aggcatccgc tatttaaaa 199 73 126 DNAHomo sapiens 73 tactatttac aattctccta acacaatggg ggcaggtggg gacggcagctcgccatcatc 60 ggggacgaca tcaaccgacg gaaagattac aaagacgatg acgataaggcatccgctatt 120 aaaaaa 126 74 160 DNA Homo sapiens 74 tttacaattctcctaacaca atgaagctga gcgagtgtct caagcgcatc ggggacgaac 60 tggacagtaacatggagctg cagaggatga ttgccgccgt ggacacagac tccccccgag 120 attacaaagacgatgacgat aaggcatccg ctattaaaaa 160 75 232 DNA Homo sapiens 75taatacgact cactataggg acaattacta tttacaattc tttctctaca atgacaggga 60aggaagccat actgcggagg ctggtggccc tgctggagga ggaggcagaa gtcattaacc 120agaagctggc ctcggacccc gccctgcgca gcaagctggt ccgcctgtcc tccgactctt 180tcgcccacct ggattacaaa gacgatgacg 
ataaggcatc cgctatttaa aa 232 76 172 DNAHomo sapiens 76 gactcactat agggacaatt actatttaca attcttactt ccaacgagggatgctctact 60 accagacaga gaaatatgat ttggctatca aagaccttaa agaagccttgattcagcttc 120 gagggaacaa tgattacaaa gacgatgacg ataaggcatc cgctatttaa aa172 77 208 DNA Homo sapiens 77 taatacgact cactataggg acaattactatttacaattc tcctaacaca atgggtgggg 60 aaagtgatac tgacccccac ttccaggatgcgctaatgca gctcgccaaa gctgtggcaa 120 gtgctgcagc tgccctggtc ctcaaggccaagagtgtggc ccaacgagat tacaaagacg 180 atgacgatag ggcatccgct atttaaaa 20878 199 DNA Homo sapiens 78 taatacgact cactataggg acaattacta tttacaattctttctctaca atgggaacac 60 gccaagacag aatgtttgag acaatggcga ttgagattgaacaacttttg gcaaggctta 120 caggggtaaa tgataaaatg gcagaatata ccaacgctgattacaaagac gatgacgata 180 aggcatccgc tatttaaaa 199 79 181 DNA Homosapiens 79 ctatttacaa ttctcctaac acaatggcgg tacaggagga tccggtgcagcgggagattc 60 accaggactg ggctaaccgg gagtacattg agataatcac cagcagcatcaagaaaatcg 120 cagactttct caactcgttc gattacaaag acgatgacga taaggcatccgctattaaaa 180 a 181 80 208 DNA Homo sapiens 80 taatacgact cactatagggacaattacta tttacaattc tcctaacaca atggcgactc 60 gacaggcctt aaatgagatctcggcccggc acagtgggat ccagcagctt gaacgcagta 120 ttcgtgagct gcacgacatattcacttttc tggctaccga agtgcgagat tacaaagacg 180 atgacgataa ggcatccgctatttaaaa 208 81 178 DNA Homo sapiens 81 taatacgact cactataggg acaattactatttacaattc tttctctaca atgatgttct 60 ccgacatcta cgggatccgg gagatcgcggacgggttgtg cctggaggtg gaggggaaga 120 tggtcagtag gccagaggat tacaaagacgatgacgataa ggcatccgct atttaaaa 178 82 169 DNA Homo sapiens 82 taatacgactcactataggg acaattacta tttacaattc tcctaacaca atgttttggc 60 tggaagaaagggactttgag gcgggtgttt ttgaactaga agcaattgtt aacagcatca 120 aaagaagcgattacaaagac gatgacgata aggcatccgc tatttaaaa 169 83 214 DNA Homo sapiens83 taatacgact cactataggg acaattacta tttacaattc ttacttcaat acaatgaaat 60gggacacaga caatactcta gggacagaaa tctcttggga gaataagttg gctgaagggt 120tgaaactgac tcttgatacc atatttgtac atcacgtcct gcatgcccca cacgattaca 180aagacgatga cgataaggca tccgctattt 
aaaa 214 84 187 DNA Homo sapiens 84taatacgact cactataggg acaattacta tttacaattc tttctctaca atgcgggggg 60cagtgttctc ccaggataag gacgtcgtgc aggaggccac aaaggtgctg aggaatgctg 120ccgacaactt ctacatcaac gacagggatt acaaagacga tgacgataag gcatccgcta 180tttaaaa 187 85 190 DNA Homo sapiens 85 gactcactat agggacaatt actatttacaattctcctaa cacaatgacc ggtacaggag 60 cacccagatt cataaaggaa gtccaggaattgaactcagc tctacatcaa tcggacctaa 120 tagacatcta cagaactctc caccccgctgattacaaaga cgatgacgat aaggcatccg 180 ctatttaaaa 190 86 130 DNA Homosapiens 86 tttacaattc tcctaacaca atgacaaaga gcaatgaact aacccgggcagtagaggaac 60 tacacaaact tttgaaagaa gctagggaag attacaaaga cgatgacgataaggcatccg 120 ctatttaaaa 130 87 199 DNA Homo sapiens 87 taatacgactcactataggg acaattacta tttacaattc tcctaacaca atgacctact 60 ggaacctgctgccccccaag cggcccatca aagaggtgct gacggacatc tttgccaagg 120 tgctggagaagggctgggtg gacagccgct ccatccacga ttacaaagac gatgacgata 180 aggcatccgctatttaaaa 199 88 97 DNA Homo sapiens 88 ctatttacaa ttctcctaac actatggactatgagatgct cttcaactcc ttcagggatt 60 acaaagacga tgacgataag gcatccgctattaaaaa 97 89 178 DNA Homo sapiens 89 taatacgact cactataggg acaattactatttacaattc tttctctaca atggccgggg 60 aggacatgga gatcagcgtg aaggagttgcggacaatcct caataggatc atcagcaaac 120 acaaagacct gcggaccgat tacaaagacgatgacgataa ggcatccgct atttaaaa 178 90 172 DNA Homo sapiens 90 taatacgactcactataggg acaattacta tttacaattc tcctaacaca atgggactaa 60 gagaagaaagtgaagagtac atggctgctg ctgatgaata caatagactg aagcaagtga 120 agcaacctgcagattacaaa gacgatgacg ataaggcatc cgctatttaa aa 172 91 318 DNA Homosapiens 91 taatacgact cactataggg acaattacta tttacaattc tttctctacaatgaagggca 60 tcatcagcag gttgatgtcc gtggaggaag aactgaagag ggaccacgcagagatgcaag 120 cggctgtgga ctccaaacag aagatcattg atgcccagga gaagcgcattgcctcgttgg 180 atgccgccaa tgcccgcctc atgagtgccc tgacccagct gaaagagaggtacagcatgc 240 aagcccgtaa cggcatctcc cccaccaacc ccgcggatta caaagacgatgacgataagg 300 catccactat ttaaaaaa 318 92 172 DNA Homo sapiens 92taatacgact cactataggg acaaatacta tttacaattc 
tcctaacaca atgtgggaac 60ggattgagga aaggctggct tatattgctg atcaccttgg cttcagctgg acagaattag 120caagagcgct ggattacaaa gacgatgacg ataaggcatc cgctatttaa aa 172 93 177 DNAHomo sapiens 93 taatacgact cactataggg gacaattact atttacaatt gcttacttcacaatggctcg 60 gggagacttt gcccaggctg cccagcagct gtggctggcc ctgcgggcactgggccggcc 120 cctgcccacc tcccacgatt acaaagacga tgacgataag gcatccgctatttaaaa 177 94 160 DNA Homo sapiens 94 taatacgact cactataggg acaattactatttacaattc tttctctaca atggtggtgg 60 atgtgccaga ttttatagtc tggcttgaggaggcagtatc tgatttacat agggccctcg 120 attacaaaga cgatgacgat aaggcatccgctatttaaaa 160 95 170 DNA Homo sapiens misc_feature 167 n = A,T,C or G95 cttttacaat tctcctaaca caatgggctt tttggctgcc gagcaggaca tccgagagga 60aatcagaaaa gttgtacaga gtttagaaca aacagctcga gaggttttaa ctctactgca 120aggggtccag gattacaaag acgatgacga taaggcatcc gctaagnaaa 170 96 227 DNAHomo sapiens 96 ttaatacgac tcactatagg gattactatt tacaattctt acttcacaatgctggaccct 60 gtaaaggatg ttctaattct ttctgctctg agacgaatgc tatgggctgcagatgacttc 120 ttagaggatt tgccttttga gcaaataggg aatctaaggg aggaaattatcaactgtgca 180 caagcggatt acaaagacga tgacgataag gcatccgcta tttaaaa 22797 161 DNA Homo sapiens misc_feature 158 n = A,T,C or G 97 ttctatttacaattctccta acacaatggc caacctccta ctcctcatgg tacccattct 60 aatcgcaatggcattcctaa tgcttaccga acgaaaaatt ctaggctata tacaaccacg 120 cgattacaaagacgatgacg ataaggcatc cgctaaanaa a 161 98 149 DNA Homo sapiensmisc_feature 16 n = A,T,C or G 98 aattctccta acacantgct ccggctaaatactaccgtat ggcccaccat aattaccccc 60 atactcctta cactattcct catcaccaaccgactaatca ccacccggga ttacaaagac 120 gatgacgata aggcatccgc tatttaaaa 14999 146 DNA Homo sapiens misc_feature 140 n = A,T,C or G 99 ctatttacaattctcctaac acaatgaccc tctacctaaa actcacagcc ctcgctgtca 60 ctttcctaggacttctaaca gccctagacc tcaactaccc aaccgattac aaagacgatg 120 acgataaggcatccgctatn aaaaaa 146 100 226 DNA Homo sapiens 100 taatacgact cactatagggacaattacta tttacaattc tcctaacaca atggcgggcg 60 tgttctcagc cgagccgtcgccgtttccac agacccgtcg cagcatggtg tttgccaggc 
120 acctgcggga ggtgggagacgagttcagga gcagacatct caactccacg gacgacgcag 180 acgaggatta caaagacgatgacgataagg catccgctat ttaaaa 226 101 229 DNA Homo sapiens 101 taatacgactcactataggg acaattacta tttacaattc tttctctaca atgggcttaa 60 aacttgccacagttgctgcc agtatggaca gagtgccaaa ggttactccc agcagtgcca 120 tcagcagcatagcaagagag aaccacgaac cagaaagatt gggcttaaat ggaatagcag 180 agacaacagattacaaagac gatgacgata aggcatccgc tatttaaaa 229 102 172 DNA Homo sapiens102 taatacgact cactataggg acaattacta tttacaattc tcctaacaca atgatgcggg 60atctcccagg acactactat gaaacgctca aattccttgt gggccatctc aagaccatcg 120ctgaccaccg cgattacaaa gacgatgacg ataaggcatc cgctatttaa aa 172 103 225DNA Homo sapiens 103 taatacgact cactataggg acaattacta tttacaattctttctctagg tgtggatgtg 60 tgggggccgc atggaggaca tcccctgctc cagggtgggccatatctaca ggaagtatgt 120 gccctacaag gtcccggccg gagtcagcct ggcccggaaccttaagcggg tggccgattg 180 gatggattac aaagacgatg acgataaggc atccgctatttaaaa 225 104 205 DNA Homo sapiens 104 taatacgact cactataggg acaattactatttacaattc tttctctaca atggcgctta 60 gttggatcga aatggacacc gagatggagatgcttctggc tagatttcgc agaaccccag 120 gagacctgca tttagaccac tctgtccatttgtgtgccca ccccgattac aaagacgatg 180 acgataaggc atccgctatt taaaa 205 105101 DNA Homo sapiens 105 ctatttacaa ttctcctaac acaatgacct ccaccctaccacacattcga agaacccgtg 60 attacaaaga cgatgacgat aaggcatccg ctatttaaaa a101 106 130 DNA Homo sapiens 106 taatacgact cactataggg acaattactatttacaattc tcctaacaca atgaacggaa 60 atctgttcgc ttcattcatc gccgacagtgattacaaaga cgatgacgat aaggcatccg 120 ctatttaaaa 130 107 164 DNA Homosapiens 107 taatacgact cactataggg acaattacta tttacaattc ttacttcgccctggacgaca 60 tcgagtggtt tgtggccccg gctgcagaag gcagccgagg ctttcaagcagctgaaccag 120 cccgattaca aagaccatga cgataaggca tccgctattt aaaa 164 108192 DNA Homo sapiens 108 taatacgact cctataggga caattactat ttacaattcttacttcaata caatgcgcac 60 cctgcaaccc aggcttcttc aaaaccaaca acagcacctgccagccctgc ccatatggtt 120 cctactccaa tggctcagac tgcacccgct ggattacaaagacgatgacg ataaggcatc 180 cgctatttaa aa 192 109 
210 DNA Homo sapiens 109taatacgact cactataggg acaattacta tttacaattc tcctaacgcc aaagcacaat 60ggctgttata attaacgaat tatctcagcg tgacagctgt ggtcctttga aaattagctt 120gaataacaag atcctggtgt atggtaattt attttcctct ttcacccccg attacaaaga 180cgatgacgat aaggcatccg ctatttaaaa 210 110 109 DNA Homo sapiens 110caattctcct aacacgatgg gactggctaa aaaaagtaaa aggaacccgg caaatcttac 60cccgcctgat tacaaagacg atgacgataa ggcatccgct atttaaaaa 109 111 131 DNAHomo sapiens misc_feature 1, 125, 126 n = A,T,C or G 111 natttctatttacaattctc ctaacacaat gagctcacag gcacttagaa tccatcagtg 60 gctccatcttttctcagact tcacctccac cgattacaaa gacgatgacg ataaggcatc 120 cgctnnaaaa a131 112 172 DNA Homo sapiens 112 taatacgact cactataggg acaattactatttacaattc tttctctaca atggaccaac 60 ccataggaaa atgggaaaag ttgttcccgttacaacttta caaaacgtta caaatgctca 120 tgtcccagat ggattacaaa gacgatgacgataaggcatc cgctatttaa aa 172 113 172 DNA Homo sapiens 113 taatacgactcactataggg acaattacta tttacaattc ttacttcaca atgggggtct 60 ctgaggccgagggaacattc ccgctcagca ctttccttct tgggatagca tcccgtctaa 120 gaagcgtggctgattacaaa gacgatgacg ataaggcatc cgctatttaa aa 172 114 187 DNA Homosapiens 114 taatacgact cactataggg acaattacta tttacaattc tcctaacacaatgagggcgc 60 ccagattcat aaagcaaata ttgctagatc taaagagaga gatagacttcaatgtgagat 120 tagtagaata cttcaaccca ctatcagatt acaaagacga tgacgataaggcatccgcta 180 tttaaaa 187 115 172 DNA Homo sapiens 115 taatacgactcactataggg acaattacta tttacaattc tttctctaca atgatcgtgg 60 ctatcattgctggtcgcctt cggatgttgg gtgaccagtt caacggagaa ttggaagctt 120 ctgccaaaaacgattacaaa gacgatgacg ataaggcatc cgctatttaa aa 172 116 180 DNA Homosapiens 116 taatacgact cactataggg acaattacta tttacaattc tttctctacaacctggcttt 60 ggcctattac agcagccgac agtatgcttc agcactgaag catatcgctgagattattga 120 gcgtggcatc cgccagcacg attacaaaga cgatgacgat aaggcatccgctatttaaaa 180 117 208 DNA Homo sapiens 117 taatacgact cactatagggacaattacta tttacaattc tttctctacg atggctgcca 60 tgttattaga cagaagaggaactgagtgtg acctctggat aaatgagatg tcactattac 120 ataagattgt 
tcaagatgtatatggaactc ctcacccgcc ccactccgat tacaaagacg 180 atgacgataa ggcatccgctatttaaaa 208 118 160 DNA Homo sapiens 118 taatacgact cactatagggacaattacta tttacaattc tcctaacaca atgccttggc 60 aatacaaacc gatagctgatctttacagag ggagagagag ccgtccctct gccccccggg 120 attacaaaga cgatgacgataaggcatccg ctatttaaaa 160 119 148 DNA Homo sapiens 119 taatacgactcactataggg acaattacta tttacaattc tttctctaca atgctgttct 60 cagtgttgctacgttatttg gcagataact ttctgccagg aggatccgat tacaaagacg 120 atgacgataaggcatccgct atttaaaa 148 120 147 DNA Homo sapiens 120 taatacgactcactataggg acaattacta tttacaattc tcctaacaca atggattggc 60 aggtgttgctaggaaaacta ctttggaaaa tagataatcc gggcatcgat tacaaagacg 120 atgacgataggcatccgcta tttaaaa 147 121 160 DNA Homo sapiens 121 taatacgactcactataggg acaattacta tttacaattc tttctctaca atgggtgcta 60 tggagagagaatgggcgatg tttctcaggg ctgcttcaag caggattagg ggtggcgtgg 120 attacaaagacgatgacgat aaggcatccg ctgtttaaaa 160 122 140 DNA Homo sapiens 122ctatttacaa ttctcctaac acaatggtgc ataactttgg gagacactgg ggtctgccct 60tgagttttct tctcaattac cctttattcc tcagtccgga ttacaaagac gatgacgata 120aggcatccgc tattaaaaaa 140 123 211 DNA Homo sapiens 123 taatacgactcactatagga aatactattt acaattctta cttcacaatg gctagcatgg 60 ctccagtggggagagatgca gaaacattgc aaaagcaaaa ggaaactata aaagcctttc 120 taaagaaactagaagccctc atggcaagca atgacaatgc caataaaacc gatgacaaag 180 acgatgacgataaggcatcc gctatttaaa a 211 124 196 DNA Homo sapiens 124 taatacgactcactataggg acaattacta tttacaattc tttctctaca atgtgtcggg 60 agcaggctgaactcactggg ctccgcctgg caagcttggg gttgaagttt aataaaatcg 120 tccattcgtctatgacgcgc gccatagaga ccaccgatta caaagacgat gacgataagg 180 catccgctatttaaaa 196 125 161 DNA Homo sapiens 125 taatacgact cactataggg gacaattactatttacaatt cttacttcac aatgggcact 60 agaattagtg atatgctaaa attaattgcagacacatggc agagaaattg ttgccctgcg 120 gattacaaag acgatgacga taaggcatccgctatttaaa a 161 126 172 DNA Homo sapiens 126 taatacgact cactatagggacaattacta tttacaattc tcctaacaca atggagcagg 60 ccagtgttaa gtatgttattctggatatgt acagagcact 
cttgacacta atgaatactt 120 caacagccac agattacaaagacgatgacg ataaggcatc cgctatttaa aa 172 127 120 DNA Homo sapiens 127caattctcct aacacaatgg aagacctaga gagtgtgtta ataagactga tcaactgggc 60aaaaggaagc cccatcccag attacaaaga cgatgacgat aaggcatccg ctatttaaaa 120128 169 DNA Homo sapiens 128 taatacgact cactataggg acaattacta tttacaattctcctaacaca atgaggccgg 60 tgtccttttg cggggctgtt tggactctga acagggcaataggaaggcat tttgtccgag 120 gtagcaggga ttacaaagac gatgacgata aggcatccgctatttaaaa 169 129 181 DNA Homo sapiens 129 taatacgact cactatagggacaattacta tttacaattc tttctctaca atgcacgcgg 60 tggtggcacg tttgcttcacattggggcaa tcatgttcca acgactagac ttcatagaac 120 aattgtctgc acccccagcggattacaaag acgatgacga taaggcatcc gctatttaaa 180 a 181 130 159 DNA Homosapiens misc_feature 155 n = A,T,C or G 130 cttttacaat tctcctaacacaatgggcca aggtacactt tggggaagtg ggatggaagc 60 atggttggca acggtgttgaaggcactccc ttggcacccc acataccagc tggagccgga 120 ttacaaagac gatgacgataaggcatccgc tatanaaaa 159 131 148 DNA Homo sapiens misc_feature 147 n =A,T,C or G 131 ttctatttac aattctccta acacaatgat agcacaggca acgaaagcaacaatagacaa 60 atggaactgc atcaaactta aaatcttcta cacctcaaag aaagaagccagcgattacaa 120 agacgatgac gataaggcat ccgctant 148 132 160 DNA Homosapiens 132 taatacgact cactataggg acaattacta tttacaattc tttctctacaatggtggtgg 60 atgtgccaga ttttatagtc tggcttgagg aggcagtatc tgatttacatagagccctcg 120 attacaaaga cgatgacgat aaggcatccg ctatttaaaa 160 133 211DNA Homo sapiens 133 taatacgact cactataggg acaattacta tttacaattctttctctaca atgcagagga 60 gagggaatga attccagctg agagacctgg ccgatgcatgggatttgtct tcaaggtcca 120 ggcagagggg atggcagatg ccaaattgca gaagtcgaagagggcccgga gattacaaag 180 acgatgacga taaggcatcc gctatttaaa a 211 134 118DNA Homo sapiens 134 tttacaattc tcctaacaca atgcggggcc tgtgggtggacagggtccta gaggaatggg 60 gcctggaacc gcggcaggat tacaaagacg atgacgataaggcatccgct attaaaaa 118 135 179 DNA Homo sapiens 135 taatacgactcactataggg acaattacta tttacaattc tttactctac aatgttcgtg 60 aggtctgttggctggaggct gcagaacatt ggtgatgaca tggaccacgc 
catttgtggc 120 catgatgtcaggctcggcga ttacaaagac gatgacgata aggcatccgc tatttaaaa 179 136 82 DNAHomo sapiens 136 gcagtggact cagaaagcca acatgtggct cctcccagcg cgattacaaagacgatgacg 60 ataaggcatc cgctatttaa aa 82 137 169 DNA Homo sapiens 137taatacgact cactataggg acaattacta tttacaattc tttctctaca atggcgggta 60cacagccact tatccttgcc cagttcatgc gtgttggagg tgacgaactt ctccacttcc 120tgctctggga ttacaaagac gatgacgata aggcatccgc tatttaaaa 169 138 190 DNAHomo sapiens 138 taatacgact cactataggg acaattacta tttacaattc tcctaacaccatgatggata 60 ccataaaggg atttgaccta atcactaatt ttcaggtggt ggctgatgctttgaacatct 120 ctttgctgcc caatccatta gcgacagcgg attacaaaga cgatgacgataaggcatacg 180 ctatttaaaa 190 139 135 DNA Homo sapiens misc_feature 128n = A,T,C or G 139 tctatttaca attctcctaa cacaatggcc acttggatgaaaacccttca aggattactg 60 gatagaattc aggctttccc ctccagcccc cacgattacaaagacgatga cgataaggca 120 tccgctanga aaaaa 135 140 159 DNA Homo sapiens140 ctatttacaa ttctcctaac acaatggaag ctaatagaaa acaaccgaaa ccaaataatt 60caagcactgc ttattacaat tttactgggg tctctatttt accctcctac aagccccaga 120ttacaaagac gatgacgata aggcatccgc tataaaaaa 159 141 118 DNA Homo sapiensmisc_feature 112 n = A,T,C or G 141 ttctatttac aattctccta acacaatggggctcactcac ccaccacatt aacaacataa 60 aaccctcatc cacacgagat tacaaagacgatgacgataa ggcatccgct anaaaaaa 118 142 177 DNA Homo sapiens 142taatacgact catataggga caattactat ttacaattct tacttcacaa tggtgagctg 60ctggccgatt actaaaatac cctttgtcta cagcctccgc ttctctcctg gctacgcaat 120tgaaaagcat agcggattac aaagacgatg acgataaggc atccgctatt taaaaaa 177 14371 DNA Artificial Sequence Oligonucleotide Primer 143 taatacgactcactataggg acaattacta tttacaatth hhhhhhhaca atggctgaag 60 aacagaaact g71 144 39 DNA Artificial Sequence Oligonucleotide Primer 144 taatacgactcactataggg acaattacta tttacaatt 39 145 33 DNA Artificial Sequencemisc_feature 25, 26, 27, 28, 29, 30, 31, 32, 33 n = A,T,C or G 145ggaacttgct tcgtctttgc aatcnnnnnn nnn 33 146 33 DNA Artificial Sequencemisc_feature 25, 26, 27, 28, 29, 30, 31, 32, 33 n = 
A,T,C or G 146ggatgatgct tcgtctttgt aatcnnnnnn nnn 33 147 45 DNA Artificial Sequencemisc_feature 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 n = A,T,C or G 147ggacaattac tatttacaat thhhhhhhha caatgnnnnn nnnnn 45 148 39 DNAArtificial Sequence Oligonucleotide Primer 148 taatacgact cactatagggacaattacta tttacaatt 39 149 41 DNA Artificial Sequence OligonucleotidePrimer 149 ttttaaatag cgcatgcctt atcgtcatcg tctttgtaat c 41 150 30 DNAArtificial Sequence Oligonucleotide Primer 150 agtatcgaat tcatgtctcagagcaaccgg 30 151 35 DNA Artificial Sequence Oligonucleotide Primer 151tacagtctcg agctagttga agcgttcctg gccct 35 152 28 PRT Homo sapiens 152Met Gly Gln Val Gly Arg Gln Leu Ala Ile Ile Gly Asp Asp Ile Asn 1 5 1015 Arg Asp Tyr Lys Asp Asp Asp Asp Lys Ala Ser Ala 20 25 153 105 DNAHomo sapiens 153 gcttccatga ggcaggctga acctgcagat atgcgcccag agatatggatcgcccaagag 60 ttgcggcgta ttggagacga gtttaacgcc tactatgcaa gggag 105 15456 DNA Homo sapiens 154 ggggcaggtg gggacggcag ctcgccatca tcggggacgacatcaaccga cggaaa 56 155 96 DNA Homo sapiens 155 aagctgagcg agtgtctcaagcgcatcggg gacgaactgg acagtaacat ggagctgcag 60 aggatgattg ccgccgtggacacagactcc ccccga 96 156 138 DNA Homo sapiens 156 acagggaagg aagccatactgcggaggctg gtggccctgc tggaggagga ggcagaagtc 60 attaaccaga agctggcctcggaccccgcc ctgcgcagca agctggtccg cctgtcctcc 120 gactctttcg cccacctg 138157 78 DNA Homo sapiens 157 ctctactacc agacagagaa atatgatttg gctatcaaagaccttaaaga agccttgatt 60 cagcttcgag ggaacaat 78 158 114 DNA Homo sapiens158 ggtggggaaa gtgatactga cccccacttc caggatgcgc taatgcagct cgccaaagct 60gtggcaagtg ctgcagctgc cctggtcctc aaggccaaga gtgtggccca acga 114 159 105DNA Homo sapiens 159 ggaacacgcc aagacagaat gtttgagaca atggcgattgagattgaaca acttttggca 60 aggcttacag gggtaaatga taaaatggca gaatataccaacgct 105 160 114 DNA Homo sapiens 160 gcggtacagg aggatccggt gcagcgggagattcaccagg actgggctaa ccgggagtac 60 attgagataa tcaccagcag catcaagaaaatcgcagact ttctcaactc gttc 114 161 114 DNA Homo sapiens 161 gcgactcgacaggccttaaa tgagatctcg gcccggcaca gtgggatcca 
gcagcttgaa 60 cgcagtattcgtgagctgca cgacatattc acttttctgg ctaccgaagt gcga 114 162 84 DNA Homosapiens 162 atgttctccg acatctacgg gatccgggag atcgcggacg ggttgtgcctggaggtggag 60 gggaagatgg tcagtaggcc agag 84 163 75 DNA Homo sapiens 163ttttggctgg aagaaaggga ctttgaggcg ggtgtttttg aactagaagc aattgttaac 60agcatcaaaa gaagc 75 164 117 DNA Homo sapiens 164 aaatgggaca cagacaatactctagggaca gaaatctctt gggagaataa gttggctgaa 60 gggttgaaac tgactcttgataccatattt gtacatcacg tcctgcatgc cccacac 117 165 93 DNA Homo sapiens 165cggggggcag tgttctccca ggataaggac gtcgtgcagg aggccacaaa ggtgctgagg 60aatgctgccg acaacttcta catcaacgac agg 93 166 102 DNA Homo sapiens 166accggtacag gagcacccag attcataaag gaagtccagg aattgaactc agctctacat 60caatcggacc taatagacat ctacagaact ctccaccccg ct 102 167 66 DNA Homosapiens 167 acaaagagca atgaactaac ccgggcagta gaggaactac acaaacttttgaaagaagct 60 agggaa 66 168 105 DNA Homo sapiens 168 acctactggaacctgctgcc ccccaagcgg cccatcaaag aggtgctgac ggacatcttt 60 gccaaggtgctggagaaggg ctgggtggac agccgctcca tccac 105 169 30 DNA Homo sapiens 169gactatgaga tgctcttcaa ctccttcagg 30 170 84 DNA Homo sapiens 170gccggggagg acatggagat cagcgtgaag gagttgcgga caatcctcaa taggatcatc 60agcaaacaca aagacctgcg gacc 84 171 78 DNA Homo sapiens 171 ggactaagagaagaaagtga agagtacatg gctgctgctg atgaatacaa tagactgaag 60 caagtgaagcaacctgca 78 172 222 DNA Homo sapiens 172 aagggcatca tcagcaggttgatgtccgtg gaggaagaac tgaagaggga ccacgcagag 60 atgcaagcgg ctgtggactccaaacagaag atcattgatg cccaggagaa gcgcattgcc 120 tcgttggatg ccgccaatgcccgcctcatg agtgccctga cccagctgaa agagaggtac 180 agcatgcaag cccgtaacggcatctccccc accaaccccg cg 222 173 78 DNA Homo sapiens 173 tgggaacggattgaggaaag gctggcttat attgctgatc accttggctt cagctggaca 60 gaattagcaagagcgctg 78 174 81 DNA Homo sapiens 174 gctcggggag actttgccca ggctgcccagcagctgtggc tggccctgcg ggcactgggc 60 cggcccctgc ccacctccca c 81 175 66DNA Homo sapiens 175 gtggtggatg tgccagattt tatagtctgg cttgaggaggcagtatctga tttacatagg 60 gccctc 66 176 105 DNA Homo sapiens 176ggctttttgg 
ctgccgagca ggacatccga gaggaaatca gaaaagttgt acagagttta 60gaacaaacag ctcgagaggt tttaactcta ctgcaagggg tccag 105 177 135 DNA Homosapiens 177 ctggaccctg taaaggatgt tctaattctt tctgctctga gacgaatgctatgggctgca 60 gatgacttct tagaggattt gccttttgag caaataggga atctaagggaggaaattatc 120 aactgtgcac aagcg 135 178 93 DNA Homo sapiens 178gccaacctcc tactcctcat ggtacccatt ctaatcgcaa tggcattcct aatgcttacc 60gaacgaaaaa ttctaggcta tatacaacca cgc 93 179 90 DNA Homo sapiens 179ctccggctaa atactaccgt atggcccacc ataattaccc ccatactcct tacactattc 60ctcatcacca accgactaat caccacccgg 90 180 78 DNA Homo sapiens 180accctctacc taaaactcac agccctcgct gtcactttcc taggacttct aacagcccta 60gacctcaact acccaacc 78 181 132 DNA Homo sapiens 181 gcgggcgtgttctcagccga gccgtcgccg tttccacaga cccgtcgcag catggtgttt 60 gccaggcacctgcgggaggt gggagacgag ttcaggagca gacatctcaa ctccacggac 120 gacgcagacg ag132 182 135 DNA Homo sapiens 182 ggcttaaaac ttgccacagt tgctgccagtatggacagag tgccaaaggt tactcccagc 60 agtgccatca gcagcatagc aagagagaaccacgaaccag aaagattggg cttaaatgga 120 atagcagaga caaca 135 183 78 DNAHomo sapiens 183 atgcgggatc tcccaggaca ctactatgaa acgctcaaat tccttgtgggccatctcaag 60 accatcgctg accaccgc 78 184 126 DNA Homo sapiens 184tgtgggggcc gcatggagga catcccctgc tccagggtgg gccatatcta caggaagtat 60gtgccctaca aggtcccggc cggagtcagc ctggcccgga accttaagcg ggtggccgat 120tggatg 126 185 111 DNA Homo sapiens 185 gcgcttagtt ggatcgaaat ggacaccgagatggagatgc ttctggctag atttcgcaga 60 accccaggag acctgcattt agaccactctgtccatttgt gtgcccaccc c 111 186 33 DNA Homo sapiens 186 acctccaccctaccacacat tcgaagaacc cgt 33 187 36 DNA Homo sapiens 187 aacggaaatctgttcgcttc attcatcgcc gacagt 36 188 70 DNA Homo sapiens 188 gacgacatcgagtggtttgt ggccccggct gcagaaggca gccgaggctt tcaagcagct 60 gaaccagccc 70189 96 DNA Homo sapiens 189 cgcaccctgc aacccaggct tcttcaaaac caacaacagcacctgccagc cctgcccata 60 tggttcctac tccaatggct cagactgcac ccgctg 96 190108 DNA Homo sapiens 190 gctgttataa ttaacgaatt atctcagcgt gacagctgtggtcctttgaa aattagcttg 60 aataacaaga 
tcctggtgta tggtaattta ttttcctctttcaccccc 108 191 48 DNA Homo sapiens 191 ggactggcta aaaaaagtaaaaggaacccg gcaaatctta ccccgcct 48 192 60 DNA Homo sapiens misc_feature1, 125, 126 n = A,T,C or G 192 agctcacagg cacttagaat ccatcagtggctccatcttt tctcagactt cacctccacc 60 193 78 DNA Homo sapiens 193gaccaaccca taggaaaatg ggaaaagttg ttcccgttac aactttacaa aacgttacaa 60atgctcatgt cccagatg 78 194 78 DNA Homo sapiens 194 ggggtctctg aggccgagggaacattcccg ctcagcactt tccttcttgg gatagcatcc 60 cgtctaagaa gcgtggct 78195 93 DNA Homo sapiens 195 agggcgccca gattcataaa gcaaatattg ctagatctaaagagagagat agacttcaat 60 gtgagattag tagaatactt caacccacta tca 93 196 78DNA Homo sapiens 196 atcgtggcta tcattgctgg tcgccttcgg atgttgggtgaccagttcaa cggagaattg 60 gaagcttctg ccaaaaac 78 197 84 DNA Homo sapiens197 gctttggcct attacagcag ccgacagtat gcttcagcac tgaagcatat cgctgagatt 60attgagcgtg gcatccgcca gcac 84 198 114 DNA Homo sapiens 198 gctgccatgttattagacag aagaggaact gagtgtgacc tctggataaa tgagatgtca 60 ctattacataagattgttca agatgtatat ggaactcctc acccgcccca ctcc 114 199 66 DNA Homosapiens 199 ccttggcaat acaaaccgat agctgatctt tacagaggga gagagagccgtccctctgcc 60 ccccgg 66 200 54 DNA Homo sapiens 200 ctgttctcagtgttgctacg ttatttggca gataactttc tgccaggagg atcc 54 201 54 DNA Homosapiens 201 gattggcagg tgttgctagg aaaactactt tggaaaatag ataatccggg catc54 202 66 DNA Homo sapiens 202 ggtgctatgg agagagaatg ggcgatgtttctcagggctg cttcaagcag gattaggggt 60 ggcgtg 66 203 72 DNA Homo sapiens203 gtgcataact ttgggagaca ctggggtctg cccttgagtt ttcttctcaa ttacccttta 60ttcctcagtc cg 72 204 120 DNA Homo sapiens 204 gctagcatgg ctccagtggggagagatgca gaaacattgc aaaagcaaaa ggaaactata 60 aaagcctttc taaagaaactagaagccctc atggcaagca atgacaatgc caataaaacc 120 205 102 DNA Homo sapiens205 tgtcgggagc aggctgaact cactgggctc cgcctggcaa gcttggggtt gaagtttaat 60aaaatcgtcc attcgtctat gacgcgcgcc atagagacca cc 102 206 66 DNA Homosapiens 206 ggcactagaa ttagtgatat gctaaaatta attgcagaca catggcagagaaattgttgc 60 cctgcg 66 207 78 DNA Homo sapiens 207 
gagcaggccagtgttaagta tgttattctg gatatgtaca gagcactctt gacactaatg 60 aatacttcaacagccaca 78 208 60 DNA Homo sapiens 208 gaagacctag agagtgtgtt aataagactgatcaactggg caaaaggaag ccccatccca 60 209 75 DNA Homo sapiens 209aggccggtgt ccttttgcgg ggctgtttgg actctgaaca gggcaatagg aaggcatttt 60gtccgaggta gcagg 75 210 87 DNA Homo sapiens 210 cacgcggtgg tggcacgtttgcttcacatt ggggcaatca tgttccaacg actagacttc 60 atagaacaat tgtctgcacccccagcg 87 211 93 DNA Homo sapiens misc_feature 155 n = A,T,C or G 211ggccaaggta cactttgggg aagtgggatg gaagcatggt tggcaacggt gttgaaggca 60ctcccttggc accccacata ccagctggag ccg 93 212 84 DNA Homo sapiensmisc_feature 147 n = A,T,C or G 212 atagcacagg caacgaaagc aacaatagacaaatggaact gcatcaaact taaaatcttc 60 tacacctcaa agaaagaagc cagc 84 213 66DNA Homo sapiens 213 gtggtggatg tgccagattt tatagtctgg cttgaggaggcagtatctga tttacataga 60 gccctc 66 214 117 DNA Homo sapiens 214cagaggagag ggaatgaatt ccagctgaga gacctggccg atgcatggga tttgtcttca 60aggtccaggc agaggggatg gcagatgcca aattgcagaa gtcgaagagg gcccgga 117 21554 DNA Homo sapiens 215 cggggcctgt gggtggacag ggtcctagag gaatggggcctggaaccgcg gcag 54 216 84 DNA Homo sapiens 216 ttcgtgaggt ctgttggctggaggctgcag aacattggtg atgacatgga ccacgccatt 60 tgtggccatg atgtcaggctcggc 84 217 39 DNA Homo sapiens 217 agtggactca gaaagccaac atgtggctcctcccagcgc 39 218 75 DNA Homo sapiens 218 gcgggtacac agccacttatccttgcccag ttcatgcgtg ttggaggtga cgaacttctc 60 cacttcctgc tctgg 75 21996 DNA Homo sapiens 219 atggatacca taaagggatt tgacctaatc actaattttcaggtggtggc tgatgctttg 60 aacatctctt tgctgcccaa tccattagcg acagcg 96 22066 DNA Homo sapiens misc_feature 128 n = A,T,C or G 220 gccacttggatgaaaaccct tcaaggatta ctggatagaa ttcaggcttt cccctccagc 60 ccccac 66 22192 DNA Homo sapiens 221 gaagctaata gaaaacaacc gaaaccaaat aattcaagcactgcttatta caattttact 60 ggggtctcta ttttaccctc ctacaagccc ca 92 222 49DNA Homo sapiens misc_feature 112 n = A,T,C or G 222 gggctcactcacccaccaca ttaacaacat aaaaccctca tccacacga 49 223 82 DNA Homo sapiens223 gtgagctgct ggccgattac 
taaaataccc tttgtctaca gcctccgctt ctctcctggc 60tacgcaattg aaaagcatag cg 82 224 11 PRT Homo sapiens 224 Lys Tyr Gln GlnLeu Phe Glu Asp Ile Arg Trp 1 5 10 225 16 PRT Homo sapiens 225 Ile GlyGlu Glu Phe Ser Arg Ala Ala Glu Lys Leu Tyr Leu Ala Val 1 5 10 15 226 23PRT Homo sapiens 226 Lys Ala Glu Val Gln Ile Ala Arg Lys Leu Gln Cys IleAla Asp Gln 1 5 10 15 Phe His Arg Leu His Val Leu 20 227 22 PRT Homosapiens 227 Met Gly Asp Val Val Gly Phe Ile Asp Glu Leu Glu Gly Ala ValSer 1 5 10 15 Asp Leu His Arg Ala Leu 20 228 15 PRT Homo sapiens 228 ThrLeu Arg His Trp Gly Leu Gln Phe Asn Thr Arg Phe Gly Val 1 5 10 15 229 14PRT Homo sapiens 229 Ser Arg Arg Glu Glu Ala Trp Asp Ala Leu Phe Arg GlyIle 1 5 10 230 17 PRT Homo sapiens 230 Thr Leu Arg Glu Ile Gly Asp LeuTyr Leu Thr Ser Ile Leu Gly Arg 1 5 10 15 Arg 231 33 DNA Homo sapiens231 aaataccagc aactttttga agatattcgg tgg 33 232 48 DNA Homo sapiens 232atcggggagg agttcagccg cgctgccgag aagctttacc tcgctgtt 48 233 69 DNA Homosapiens 233 aaagcagagg tacagattgc ccgaaagctt cagtgcattg cagaccagttccaccggctt 60 catgtgctt 69 234 66 DNA Homo sapiens 234 atgggagatgtggttggttt tatagacgaa cttgaggggg cagtgtctga tttacatagg 60 gcgttg 66 23545 DNA Homo sapiens 235 acactccgac actggggatt acagttcaac acaagatttggtgtg 45 236 42 DNA Homo sapiens 236 tcgagaaggg aagaggcatg ggatgctttatttcgtggga tc 42 237 42 DNA Homo sapiens 237 tcgagaaggg aagaggcatgggatgcttta tttcgtggga tc 42 238 18 PRT Homo sapiens 238 Met Pro Val ValHis Leu Thr Leu Thr Thr Ala Gly Asp Asp Phe Ser 1 5 10 15 Arg Arg 239 25PRT Homo sapiens 239 Met Pro Gln Asp Ala Ser Thr Lys Lys Leu Ser Glu CysLeu Lys Arg 1 5 10 15 Ile Gly Asp Glu Leu Asp Ser Asn Gly 20 25 240 17PRT Homo sapiens 240 Met Gly Gln Val Gly Arg Gln Leu Ala Ile Ile Gly AspAsp Ile Asn 1 5 10 15 Arg 241 138 PRT Homo sapiens 241 Met Ala Lys GlnPro Ser Asp Val Ser Ser Glu Cys Asp Arg Glu Gly 1 5 10 15 Arg Gln LeuGln Pro Ala Glu Arg Pro Pro Gln Leu Arg Pro Gly Ala 20 25 30 Pro Thr SerLeu Gln Thr Glu Pro Gln Asp Arg Ser Pro Ala Pro Met 35 40 45 
Ser Cys AspLys Ser Thr Gln Thr Pro Ser Pro Pro Cys Gln Ala Phe 50 55 60 Asn His TyrLeu Ser Ala Met Ala Ser Met Arg Gln Ala Glu Pro Ala 65 70 75 80 Asp MetArg Pro Glu Ile Trp Ile Ala Gln Glu Leu Arg Arg Ile Gly 85 90 95 Asp GluPhe Asn Ala Tyr Tyr Ala Arg Arg Val Phe Leu Asn Asn Tyr 100 105 110 GlnAla Ala Glu Asp His Pro Arg Met Val Ile Leu Arg Leu Leu Arg 115 120 125Tyr Ile Val Arg Leu Val Trp Arg Met His 130 135 242 135 PRT Homo sapiens242 Met Asp Gly Ser Gly Glu Gln Pro Arg Gly Gly Gly Pro Thr Ser Ser 1 510 15 Glu Gln Ile Met Lys Thr Gly Ala Leu Leu Leu Gln Gly Phe Ile Gln 2025 30 Asp Arg Ala Gly Arg Met Gly Gly Glu Ala Pro Glu Leu Ala Leu Asp 3540 45 Pro Val Pro Gln Asp Ala Ser Thr Lys Lys Leu Ser Glu Cys Leu Lys 5055 60 Arg Ile Gly Asp Glu Leu Asp Ser Asn Met Glu Leu Gln Arg Met Ile 6570 75 80 Ala Ala Val Asp Thr Asp Ser Pro Arg Glu Val Phe Phe Arg Val Ala85 90 95 Ala Asp Met Phe Ser Asp Gly Asn Phe Asn Trp Gly Arg Val Val Ala100 105 110 Leu Phe Tyr Phe Ala Ser Lys Leu Val Leu Lys Ala Asp Val ValTyr 115 120 125 Asn Ala Phe Ser Leu Arg Val 130 135 243 110 PRT Homosapiens 243 Met Gly Ala Ala Met Ala Gly Gln Glu Asp Pro Val Gln Arg GluIle 1 5 10 15 His Gln Asp Trp Ala Asn Arg Glu Tyr Ile Glu Ile Ile ThrSer Ser 20 25 30 Ile Lys Lys Ile Ala Asp Phe Leu Asn Ser Phe Asp Met SerCys Arg 35 40 45 Ser Arg Leu Ala Thr Leu Asn Glu Lys Leu Thr Ala Leu GluArg Arg 50 55 60 Ile Glu Tyr Ile Glu Ala Arg Val Thr Lys Gly Glu Thr LeuThr Arg 65 70 75 80 Thr Val Pro Cys Cys Cys Trp Glu Val Ala Leu His AsnThr Gly His 85 90 95 Met Gly Lys Ala Pro Ala Ala Phe Ser Ser Phe Leu SerPro 100 105 110 244 122 PRT Homo sapiens 244 Met Ala Ala Val Leu Gln GlnVal Leu Glu Asn Ala His Ile Lys Leu 1 5 10 15 Ser Asn Leu Tyr Lys SerAla Ala Asp Asp Ser Glu Ala Lys Ser Asn 20 25 30 Glu Leu Thr Arg Ala ValGlu Glu Leu His Lys Leu Leu Lys Glu Ala 35 40 45 Gly Glu Ala Asn Lys AlaIle Gln Asp His Leu Leu Glu Val Glu Gln 50 55 60 Ser Lys Asp Gln Met GluLys Glu Met Leu Glu Lys Ile Gly Arg Leu 65 70 75 80 Glu Lys 
Glu Leu GluAsn Ala Asn Asp Leu Leu Ser Ala Thr Lys Arg 85 90 95 Lys Gly Ala Ile LeuSer Glu Glu Glu Leu Ala Ala Met Ser Pro Thr 100 105 110 Arg Gly Gly IleAsn Arg Gly Asn Ile Asn 115 120 245 19 PRT Homo sapiens 245 Arg Trp TrpMet Cys Gly Gly Arg Met Glu Asp Met Leu Cys Cys Arg 1 5 10 15 Val GlyHis 246 8 DNA Artificial Sequence Source Tag 246 aactcctc 8 247 8 DNAArtificial Sequence Source Tag 247 aatctacc 8 248 8 DNA ArtificialSequence Source Tag 248 aacaacac 8 249 8 DNA Artificial Sequence SourceTag 249 aatattcc 8 250 8 DNA Artificial Sequence Source Tag 250 ctcctaac8 251 8 DNA Artificial Sequence Source Tag 251 ctttctct 8 252 8 DNAArtificial Sequence Source Tag 252 cttacttc 8 253 8 DNA ArtificialSequence Source Tag 253 atttcaat 8
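
By way of illustration only, the eight-nucleotide source tags of SEQ ID NOS: 246-253 may be looked up in a clone sequence as sketched below. The sketch assumes that the tag sits immediately 3' of the leader sequence "tttacaatt", as suggested by the eight "h" positions of the primer of SEQ ID NO: 143; the names LEADER, SOURCE_TAGS, and find_source_tag are hypothetical and are not part of the sequence listing.

```python
# Source tags as given in the sequence listing (SEQ ID NOS: 246-253),
# mapped to their SEQ ID numbers.
SOURCE_TAGS = {
    "aactcctc": 246,
    "aatctacc": 247,
    "aacaacac": 248,
    "aatattcc": 249,
    "ctcctaac": 250,
    "ctttctct": 251,
    "cttacttc": 252,
    "atttcaat": 253,
}

# Assumption: the tag immediately follows this leader, mirroring the
# position of the eight "h" residues in the primer of SEQ ID NO: 143.
LEADER = "tttacaatt"
TAG_LENGTH = 8


def find_source_tag(sequence):
    """Return (tag, SEQ ID number) for the source tag, or None if not found."""
    # Remove the spacing used in the listing and normalize case.
    seq = "".join(sequence.lower().split())
    start = seq.find(LEADER)
    if start == -1:
        return None  # leader absent, e.g. a 5'-truncated read
    tag_start = start + len(LEADER)
    tag = seq[tag_start:tag_start + TAG_LENGTH]
    seq_id = SOURCE_TAGS.get(tag)
    if seq_id is None:
        return None  # the eight nucleotides are not a listed tag
    return tag, seq_id
```

Under these assumptions, the first 60 nucleotides of SEQ ID NO: 102 yield the tag of SEQ ID NO: 250, and those of SEQ ID NO: 101 yield the tag of SEQ ID NO: 251.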

What is claimed is:
 1. A substantially pure human Bcl-X_(L)-binding polypeptide, said polypeptide consisting of the sequence of any of SEQ ID NOS: 4-50, 63-71, and 224-228.
 2. A substantially pure human Bcl-X_(L)-binding polypeptide, said polypeptide comprising the sequence of any of SEQ ID NOS: 51-62, 229, and 230.
 3. An isolated nucleic acid molecule encoding a polypeptide of claim 1 or 2.
 4. The isolated nucleic acid of claim 3, wherein said nucleic acid molecule consists of the sequence of any of SEQ ID NOS: 156-202, 215-223, and 231-235.
 5. The isolated nucleic acid of claim 3, wherein said nucleic acid molecule comprises the sequence of any of SEQ ID NOS: 203-214, 236, and 237.
 6. A vector comprising the isolated nucleic acid molecule of claim 3.
 7. A cell comprising the isolated nucleic acid molecule of claim 3.
 8. A cell comprising the vector of claim 6.
 9. A method of identifying a Bcl-X_(L)-binding polypeptide, said method comprising the steps of: (a) providing a population of source labeled nucleic acid-protein fusion molecules; (b) contacting said population of nucleic acid-protein fusion molecules with a Bcl-X_(L) polypeptide under conditions that allow interaction between the protein portion of a nucleic acid-protein fusion molecule of said population and said Bcl-X_(L) polypeptide; (c) detecting an interaction between said protein portion and said Bcl-X_(L) polypeptide, thereby identifying a Bcl-X_(L)-binding polypeptide.
 10. The method of claim 9, wherein said population of source labeled nucleic acid-protein fusion molecules is derived from more than one source.
 11. The method of claim 9, wherein, in step (a), said nucleic acid-protein fusion molecules are detectably-labeled.
 12. The method of claim 11, wherein, in step (b), said Bcl-X_(L) polypeptide is immobilized on a solid support; and wherein, in step (c), the detection of an interaction between said protein portion of a nucleic acid-protein fusion molecule and said Bcl-X_(L) polypeptide is carried out by detecting the labeled nucleic acid-protein fusion molecule bound to said solid support.
 13. The method of claim 12, wherein said solid support is a chip or a bead.
 14. A method of identifying a compound that modulates binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide, said method comprising the steps of: (a) contacting a Bcl-X_(L) polypeptide with (i) a Bcl-X_(L)-binding polypeptide, said Bcl-X_(L)-binding polypeptide consisting of the sequence of any of SEQ ID NOS: 4-50, 63-71, and 224-228, and (ii) a candidate compound, under conditions that allow binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide; (b) determining the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide, wherein an increase or decrease in the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide, relative to the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide in the absence of said candidate compound, indicates a compound that modulates the binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide.
 15. A method of identifying a compound that modulates binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide, said method comprising the steps of: (a) contacting a Bcl-X_(L) polypeptide with (i) a Bcl-X_(L)-binding polypeptide, said Bcl-X_(L)-binding polypeptide comprising the sequence of any of SEQ ID NOS: 51-62, 229, and 230, and (ii) a candidate compound, under conditions that allow binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide; (b) determining the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide, wherein an increase or decrease in the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide, relative to the level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide in the absence of said candidate compound, indicates a compound that modulates the binding between a Bcl-X_(L) polypeptide and a Bcl-X_(L)-binding polypeptide.
 16. The method of claim 14 or 15, wherein said Bcl-X_(L)-binding polypeptide is part of a nucleic acid-protein fusion molecule.
 17. The method of claim 14 or 15, wherein, in step (a), said Bcl-X_(L) polypeptide is attached to a solid support.
 18. The method of claim 17, wherein said Bcl-X_(L)-binding polypeptide is detectably-labeled; and, in step (b), said level of binding between said Bcl-X_(L) polypeptide and said Bcl-X_(L)-binding polypeptide is determined by measuring the amount of Bcl-X_(L)-binding protein that binds to said solid support.
 19. The method of claim 17, wherein said solid support is a chip or a bead.
 20. A method of source-labeling a nucleic acid-protein fusion molecule, said method comprising the steps of: (a) providing an RNA molecule; (b) generating a first cDNA strand from said RNA molecule; (c) generating a second cDNA strand complementary to said first cDNA strand, wherein said second cDNA strand comprises a nucleic acid sequence that identifies the source of said RNA molecule; (d) generating an RNA molecule from the double stranded cDNA molecule of step (c); (e) attaching a peptide acceptor to said RNA molecule of step (d); (f) in vitro translating said RNA to generate a source labeled nucleic acid-protein fusion molecule.
 21. A source-labeled nucleic acid-protein fusion molecule, said nucleic acid portion of said fusion molecule comprising a coding sequence for said protein and a label that identifies the source of said nucleic acid portion.
 22. A method of identifying the source of the nucleic acid portion of a nucleic acid-protein fusion molecule, said method comprising the steps of: (a) providing a population of nucleic acid-protein fusion molecules, said molecules comprising a source label that identifies the source of the nucleic acid portion of said nucleic acid-protein fusion molecules; and (b) determining the identity of said source label, thereby identifying the source of the nucleic acid portion of a nucleic acid-protein fusion molecule.
 23. The method of claim 22, wherein said source label is cell type-specific.
 24. The method of claim 22, wherein said source label is tissue-specific.
 25. The method of claim 22, wherein said source label is species-specific.
 26. The method of claim 22, wherein said population of nucleic acid-protein fusion molecules contains subpopulations of nucleic acid-protein fusion molecules from a plurality of sources.
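
By way of illustration only, the comparison recited in step (b) of claims 14 and 15 (the level of binding in the presence of a candidate compound relative to the level in its absence) may be sketched as follows. The function name classify_modulation and the tolerance parameter are hypothetical; the tolerance is an assumed allowance for measurement noise, and the claims themselves recite no numerical threshold.

```python
def classify_modulation(binding_with_compound, binding_without_compound,
                        tolerance=0.1):
    """Compare binding with and without a candidate compound.

    Returns "increase", "decrease", or "no change". The tolerance is an
    assumed fractional noise allowance, not a value taken from the claims.
    """
    if binding_without_compound <= 0:
        raise ValueError("reference binding level must be positive")
    # Fractional change relative to the compound-free reference level.
    change = (binding_with_compound - binding_without_compound) / binding_without_compound
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "no change"


def modulates_binding(binding_with_compound, binding_without_compound,
                      tolerance=0.1):
    """True when the compound increases or decreases the level of binding."""
    result = classify_modulation(binding_with_compound,
                                 binding_without_compound, tolerance)
    return result != "no change"
```

In this sketch the binding levels may be any positive measurements in consistent units, for example the amount of labeled Bcl-X_(L)-binding polypeptide retained on the solid support of claim 18.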